Abstract — Systems that employ network coding for content 
distribution convey to the receivers linear combinations of the 
source packets. If we assume randomized network coding, during 
this process the network nodes collect random subspaces of 
the space spanned by the source packets. We establish several 
fundamental properties of the random subspaces induced in 
such a system, and show that these subspaces implicitly carry 
topological information about the network and its state that can 
be passively collected and inferred. We leverage this information 
towards a number of applications that are interesting in their 
own right, such as topology inference, bottleneck discovery in 
peer-to-peer systems and locating Byzantine attackers. We thus
argue that randomized network coding, apart from its
better-known properties for improving information delivery rate,
can additionally facilitate network management and control.



I. Introduction 

Randomized network coding offers a promising technique 
for content distribution systems. In randomized network cod- 
ing, each node in the network combines its incoming packets 
randomly and sends them to its neighbours [1], [2]. This is
the approach adopted by most practical applications today. 
For example, Avalanche, the first implementation of a peer- 
to-peer (P2P) system that uses network coding, adopts such a 
randomized operation [3], [4]. In ad-hoc wireless and sensor
networks as well, most proposed protocols employing network 
coding again opt for randomized network operation (see (9) 
and references therein). 

Randomized network coding is popular because it facilitates
a very simple and flexible network operation, without the need
for synchronization among network nodes, that is well suited
to packet networks. To every packet, a coding vector is
appended that determines how the packet is expressed with
respect to the original data packets produced at the source
node. When intermediate nodes combine packets, the coding
vector keeps track of the linear combinations contained in a
particular packet. A receiver that collects enough packets
uses the coding vectors to determine the set of linear
equations it needs to solve in order to recover the
original data packets.

Our contributions start with the observation that coding vectors
implicitly carry information about the network structure
as well as its state¹. Such vectors belong to appropriately
defined vector spaces, and we are interested in fundamental 
properties of these (finite-field) vector spaces. In particular, 
since we are investigating properties induced by randomized 
network coding, we need to characterize random subspaces of 
the aforementioned vector spaces. These properties of random 
subspaces over finite fields might be of independent interest. 
We aim to show, using these properties, that by observing the
coding vectors we can passively collect structural and state
information about a network. We can leverage this information
towards several applications that are interesting in their own
right, such as topology inference, network tomography, and
network management (we do not claim here the design of
practical protocols that use these properties). Rather, we
show that randomized network coding, apart from its
better-known properties for facilitating information delivery,
can provide us with information about the network itself.

To support this claim, we start by studying the problem 
of passive topology inference in a content distribution sys- 
tem where intermediate nodes perform randomized network 
coding. We show that the subspaces nodes collect during the
dissemination process depend on each other in a way that is
inherited from the network structure. Using this dependence,
we describe the conditions that let us perfectly reconstruct
the topology of a network, if the subspaces of all nodes
at some time instant are available.

We then investigate a reverse or dual problem of topology
inference, namely finding the location of Byzantine attackers.
In a network coded system, the adversarial nodes in the 
network can disrupt the normal operation of information flow 
by inserting erroneous packets into the network. We use the 
dependence between subspaces gathered by network nodes and 
the topology of the network to extract information about the 
location of attackers. We propose several methods, compare 
them and investigate the conditions that allow us to find the 
location of attackers up to a small uncertainty. 

Finally, we observe that the received subspaces, even
at one specific node, reveal some information about the
network, such as the existence of bottlenecks or congestion.
We consider P2P networks for content distribution that use
randomized network coding techniques. It is known that the
performance of such P2P networks depends critically on the
good connectivity of the overlay topology. Building on our
observation, we propose algorithms for topology management
to avoid bottlenecks and clustering in network-coded P2P
systems. The proposed approach is decentralized, inherently
adapts to the network topology, and substantially reduces the
number of topology rewirings that are necessary to maintain a
well-connected overlay; moreover, it is integrated into the
normal content distribution.

1 By state we refer to link or node failures, congestion in some part of the
network, etc.

The paper is organized as follows. We start with the notation
and problem modeling in §II. We investigate the properties of
vector spaces in a system that employs randomized network
coding in §III, and these properties give the framework to
explore applications in §IV, §V and §VI. Finally, we conclude
the paper with a discussion in §VII. Shorter versions of these
results have also appeared in [10], [11], [12].

A. Related Work 

Network coding started with the work of Ahlswede et al. [13],
who showed that a source can multicast information at a
rate approaching the smallest min-cut between the source and
any receiver if the intermediate nodes in the network combine
the information packets. Li et al. [14] showed that linear
network coding with finite field size is sufficient for multicast.
Koetter et al. [15] presented an algebraic framework for linear
network coding.

Randomized network coding was proposed by Ho et al. [16],
who showed that randomly choosing the network code
leads to a valid solution for a multicast problem with high
probability if the field size is large. It was later applied by
Chou et al. [2] to demonstrate the practical aspects of random
linear network coding. Gkantsidis et al. [3], [4] implemented
a practical file sharing system based on this idea. Several
other works have also adopted randomized network coding 
for content distribution, see for example 0, J6j, Q- 

Network error-correcting codes, which are capable of correcting
errors inserted in the network, have been developed during
the last few years. See, for example, the work of Koetter et
al. [17], Jaggi et al. [18], Ho et al. [19], Yeung et al. [20], [21],
Zhang [22], and Silva et al. [23]. These schemes are capable
of delivering information despite the presence of Byzantine
attacks in the network or node malfunctions, as long as the
amount of undesired information is limited. They can correct
malicious packet corruption up to a certain rate. In contrast,
we use network coding to identify malicious nodes in our work.
Recently, and following our work [12], additional approaches
have been proposed in the literature, some building on our
results [24].

Overlay topology monitoring and management without
network coding has been intensively studied; see, for
example, [25]. In the context of network coding, however,
it is a new area of research. Fragouli et al. [26], [27] took
advantage of network coding capabilities for active link loss
network monitoring, where the focus was on link loss rate
inference. Passive inference of link loss rates has also been
proposed by Ho et al. [28]. In a subsequent work of ours,
Sharma et al. [29] study passive topology estimation for
the upstream nodes of every network node. That work is based
on the assumption that the local coding vectors for each node
in the network are fixed, generated in advance and known by
all other nodes in the network, unlike our work, which builds
on randomized operation. The idea of passive inference of
topological properties from subspaces that are built over time
is, as far as we know, a novel contribution of this work.

II. Models: Coding and Network Operation 

A simple observation motivates much of the work presented 
in this paper: the subspaces gathered by the network nodes 
during information dissemination with randomized network 
coding, are not completely random, but have some rela- 
tionship, and this relationship conveys information about the 
network topology as well as its state. We will thus investigate 
properties of the collected subspaces and show how we can 
use them for diverse applications. 

Different properties of the subspaces are relevant to each 
particular application and therefore we will develop a frame- 
work for investigating these properties. This will also involve 
some understanding of modeling the problem to fit the re- 
quirements of an application and then developing subspace 
properties relevant to that model. 

A. Notation 

Let $q \ge 2$ be a power of a prime. In this paper, all vectors
and matrices have elements in a finite field $\mathbb{F}_q$. We use
$\mathbb{F}_q^{n \times m}$ to denote the set of all $n \times m$ matrices over
$\mathbb{F}_q$, and $\mathbb{F}_q^{\ell}$ to denote the set of all row vectors of
length $\ell$. The set $\mathbb{F}_q^{\ell}$ forms an $\ell$-dimensional vector
space over the field $\mathbb{F}_q$. Note that all vectors are row vectors
unless otherwise stated. Bold lower-case letters, e.g., $\boldsymbol{v}$, are
used for vectors and bold capital letters, e.g., $\boldsymbol{X}$, are used
to denote matrices.

For a set of vectors $\{v_1, \ldots, v_k\}$ we denote their linear span
by $\langle v_1, \ldots, v_k \rangle$. For a matrix $X$, $\langle X \rangle$ is the subspace
spanned by the rows of $X$. We then have $\mathrm{rank}(X) = \dim(\langle X \rangle)$.

We denote subspaces of a vector space by $\Pi$ and sometimes
also by $\pi$. In this paper, we work on a vector space $\mathbb{F}_q^{\ell}$ of
dimension $\ell$ defined over a finite field $\mathbb{F}_q$. For two subspaces
$\Pi_1, \Pi_2 \subseteq \mathbb{F}_q^{\ell}$, we will denote their intersection by
$\Pi_1 \cap \Pi_2$ and their joint span by $\Pi_1 + \Pi_2$, where

$$\Pi_1 + \Pi_2 = \{ v_1 + v_2 \mid v_1 \in \Pi_1,\ v_2 \in \Pi_2 \}$$

is the smallest subspace that contains both $\Pi_1$ and $\Pi_2$. It is
well known that

$$\dim(\Pi_1 + \Pi_2) = \dim(\Pi_1) + \dim(\Pi_2) - \dim(\Pi_1 \cap \Pi_2).$$

We also use the following metric to measure the distance
between two subspaces,

$$d_S(\Pi_1, \Pi_2) \triangleq \dim(\Pi_1 + \Pi_2) - \dim(\Pi_1 \cap \Pi_2) \qquad (1)$$
$$\phantom{d_S(\Pi_1, \Pi_2)} = \dim(\Pi_1) + \dim(\Pi_2) - 2 \dim(\Pi_1 \cap \Pi_2).$$

This metric was also introduced in [17], where it was used to
design error correction codes.
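
To make the metric in (1) concrete, the following Python sketch computes $d_S$ for two subspaces given by spanning sets of row vectors. It is only an illustration under stated assumptions: the field is restricted to a prime field GF(p) (the paper allows any prime power $q$), and the helper names `rank_gf` and `subspace_distance` are hypothetical, not taken from the paper. The intersection dimension is obtained from the dimension formula above rather than computed directly.

```python
def rank_gf(rows, p):
    """Rank over GF(p), p prime, of the matrix whose rows are `rows`."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Find a row at or below `rank` with a non-zero entry in this column.
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def subspace_distance(A, B, p):
    """d_S between the row spaces of A and B (lists of row vectors)."""
    dim_a, dim_b = rank_gf(A, p), rank_gf(B, p)
    dim_sum = rank_gf(A + B, p)        # dim(Pi_1 + Pi_2)
    dim_int = dim_a + dim_b - dim_sum  # dim of the intersection
    return dim_sum - dim_int


# Two planes in F_5^3 that share the line spanned by (0, 1, 0).
P1 = [[1, 0, 0], [0, 1, 0]]
P2 = [[0, 1, 0], [0, 0, 1]]
print(subspace_distance(P1, P2, 5))  # 2 = (2 + 2) - 2 * 1
```

Working with spanning sets rather than canonical bases keeps the sketch short; the distance depends only on the spanned subspaces, so any spanning set gives the same value.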

In addition to the metric $d_S(\cdot,\cdot)$ defined above, in some
cases we will also need a measure that compares how a set $A$
of subspaces differs from another set $B$ of subspaces. For this
we will use the average pair-wise distance defined as follows

$$D_S(A, B) \triangleq \frac{1}{|A|\,|B|} \sum_{\pi_a \in A,\ \pi_b \in B} d_S(\pi_a, \pi_b). \qquad (2)$$

It should be noted that the above relation does not define a
metric for the set of subspaces because the self distance of
a set with itself is not zero. However, $D_S(\cdot,\cdot)$ satisfies the
triangle inequality.

In this paper we will be interested in investigating the
relationship of the collected subspaces at neighboring network
nodes. We consider a network represented as a directed acyclic
graph $G = (V, E)$, with $\vartheta = |V|$ nodes and $\xi = |E|$ edges.
For an arbitrary edge $e = (u, v) \in E$, we denote $\mathrm{head}(e) = v$
and $\mathrm{tail}(e) = u$. For an arbitrary node $v \in V$, we denote by
$\mathrm{In}(v)$ the set of incoming edges to $v$ and by $\mathrm{Out}(v)$ the set of
outgoing edges from $v$. If a node $u$ has $p$ parents $u_1, \ldots, u_p$,
we denote with $P(u) = \{u_1, \ldots, u_p\}$ the set of parents of $u$.
We use $P^l(u)$ to denote the set of all ancestors of $u$ at distance
$l$ from $u$ in the network (we say that two nodes $u$ and $v$ are
at distance $l$ if there exists a path of length exactly $l$ that
connects them). We denote with $\pi_u^{(i)}(t)$ the subspace node $u$
receives from parent $u_i$ at exactly time $t$, and with $\pi_u(t)$ the
whole subspace (from all parents) that node $u$ receives at
time $t$, that is, $\pi_u(t) = \sum_{i=1}^{p} \pi_u^{(i)}(t)$. We also denote with
$\Pi_u^{(i)}(t)$ the subspace node $u$ has received from parent $u_i$ up
to time $t$, that is, $\Pi_u^{(i)}(t) = \Pi_u^{(i)}(t-1) + \pi_u^{(i)}(t)$. Then the
subspace $\Pi_u(t)$ that the node has at time $t$ can be expressed as
$\Pi_u(t) = \sum_{i=1}^{p} \Pi_u^{(i)}(t)$. For a set of nodes $\mathcal{U} \subseteq V$,
we define $\Pi_{\mathcal{U}} = \sum_{u \in \mathcal{U}} \Pi_u$.



Finally, we use the big O notation which is defined as
follows. Let $f(x)$ and $g(x)$ be two functions defined on some
subset of the real numbers. We write $f(x) = O(g(x))$ if
and only if there exists a positive real number $M$ and a real
number $x_0$ such that $|f(x)| \le M |g(x)|$ for all $x > x_0$. During
the rest of the paper we use $O$ to compare functions of the
field size $q$, unless otherwise stated. For example, we will use
$f(q) = O(q^{-1})$ to imply that the value of $f(q)$ goes to zero
as $q^{-1}$ for $q \to \infty$.



B. Network Operation 

We assume that there is an information source located on a
node $S$ that has a set of $n$ packets (messages) $\{x_1, \ldots, x_n\}$,
$x_i \in \mathbb{F}_q^{\ell}$, to distribute to a set of receivers, where each
packet is a sequence of $\ell$ symbols over the finite field $\mathbb{F}_q$.
To do so, we will employ a dissemination protocol based on
randomized network coding, namely, where each network node
sends random linear combinations (with coefficients chosen
uniformly over $\mathbb{F}_q$) of its collected packets to its neighbors.
We assume for simplicity that there are no packet losses.

Dissemination Protocol 

It is possible to separate the dissemination protocols into 
the following operation categories. 

• Synchronous: All nodes are synchronized and transmit
to their neighbors according to a global clock tick (timeslot).
At timeslot $t \in \mathbb{N}$, node $v$ sends linear combinations
from all vectors it has collected up to time $t - 1$. Once
nodes start transmitting information, they keep transmitting
until all receivers are able to decode.
• Asynchronous: Nodes transmit linear combinations at
randomly and independently chosen time instants.

In this paper, we focus on the synchronous network, where
we assume that each link has unit delay² corresponding to each
timeslot; however, our results can be extended to asynchronous
networks as well.

Next, we explain in detail the dissemination protocol, which
is summarized in Algorithm II.1.

Timing: We depict in Fig. 1 the relative timing of
events within a timeslot. Nodes transmit at the beginning of
a timeslot. We assume that each packet is received by its
intended receiver before the end of the timeslot. Thus, the
timeslot duration incorporates the packet propagation delay in
one edge of the network.



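[Figure: time axis from $t - 1$ to $t$ covering slot number $t$. Node A transmits from $\Pi_A(t-1)$ at the beginning of the slot, node B receives before the end of the slot, and the subspaces are measured at the end of the slot, giving $\Pi_B(t)$.]

Fig. 1. Timing schedule of the dissemination protocol given by Algorithm II.1.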



Rate Allocation and Equivalent Network Graph: The
dissemination protocol first associates with each link $e$ of
the network a rate $r_e$ (measured as the number of packets
transmitted per timeslot on edge $e$). These rates are selected
in advance using a rate allocation method, for example [8].

For the rest of the paper, we consider an equivalent network
graph, where each edge $e$ has capacity equal to its allocated
rate $r_e$. On this new graph, we can define the min-cut $c_v$ from
the source node $S$ to a node $v \in V$. Whenever we refer to
min-cut values in the following, we imply min-cut values over
this equivalent graph.

We assume that the rate allocation protocol we use satisfies

$$r_e \le \min\left[ c_e,\ c_{\mathrm{tail}(e)} \right], \qquad (3)$$

where $c_e$ is the capacity of edge $e$. This very mild assumption
says that the node $v = \mathrm{tail}(e)$ does not send more information
than it receives, and is satisfied by all protocols that do not
send redundant packets, i.e., observe flow conservation.

In our work, we consider the case where $n \gg c_v$, namely,
the dissemination of the $n$ source packets to the receivers takes
place by using the network over several timeslots.

Node operation: When the dissemination starts, at timeslot
say zero, the source starts transmitting at each time slot
and to each of its outgoing edges $e$, $r_e$ randomly selected
linear combinations of the $n$ information packets. We will call
$r_S$ the source rate. The source continues until it has transmitted
linear combinations of all $n$ packets, i.e., for $n / r_S$ timeslots.

2 Unit delay can model a buffering window a node needs to wait to collect
packets from all its neighbors.

Every other node $v \in V \setminus \{S\}$ in the network operates as
follows:

• Initially it does not transmit, but only collects in a buffer
packets from its parents, until a time $\tau_v$, which we call
waiting time and we will define in the following. As we
will see, each node can decide the waiting time by itself
and independently from other nodes.
• At each timeslot $t$, for all $t \ge \tau_v + 1$, it transmits to each
outgoing edge $e$, $r_e$ linear combinations of all packets it
has collected in its buffer up to time $t - 1$.



Algorithm II.1: DisseminationProtocol(G = (V, E), S, n, τ_v, r_e)

    for each v ∈ V \ {S}
        do Π_v(0) = ∅, d_v(0) = 0
    t ← 1
    while min_v d_v(t) < n
        do for each v ∈ V
               do if t ≥ τ_v + 1
                      then for each e ∈ Out(v)
                               do node v transmits from Π_v(t − 1) with rate r_e on e
           for each v ∈ V
               do update Π_v(t), d_v(t)
           t ← t + 1

Alg. II.1. Dissemination protocol.
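
The protocol can be sketched in Python by tracking only the coding vectors each node collects. This is a simplified illustration rather than the authors' implementation, under several assumptions that are not in the paper: the field is a prime field GF(p) with the hypothetical choice p = 2^31 − 1, the waiting times `tau` are supplied as fixed inputs instead of being determined online from Definition 1, the source keeps transmitting until every node has dimension n, and the diamond network and the names `disseminate` and `rank_gf` are invented for the example.

```python
import random


def rank_gf(rows, p):
    """Rank over GF(p), p prime, of the matrix whose rows are `rows`."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def disseminate(edges, source, n, tau, rate, p, seed=0, max_slots=1000):
    """Return a list whose entry t-1 maps each node v to d_v(t)."""
    rng = random.Random(seed)
    nodes = {u for edge in edges for u in edge}
    buf = {v: [] for v in nodes}
    # The source holds the n unit coding vectors.
    buf[source] = [[int(i == j) for j in range(n)] for i in range(n)]
    history = []
    t = 1
    while t <= max_slots and min(rank_gf(buf[v], p) for v in nodes) < n:
        # Transmissions in slot t only use what was collected up to t - 1.
        snapshot = {v: list(buf[v]) for v in nodes}
        for (u, v) in edges:
            if t >= tau[u] + 1 and snapshot[u]:
                for _ in range(rate[(u, v)]):
                    coeffs = [rng.randrange(p) for _ in snapshot[u]]
                    buf[v].append([
                        sum(c * x[k] for c, x in zip(coeffs, snapshot[u])) % p
                        for k in range(n)
                    ])
        history.append({v: rank_gf(buf[v], p) for v in nodes})
        t += 1
    return history


# Hypothetical diamond network S -> {A, B} -> C with unit rates.
edges = [("S", "A"), ("S", "B"), ("A", "C"), ("B", "C")]
rate = {e: 1 for e in edges}
tau = {"S": 0, "A": 1, "B": 1, "C": 2}  # assumed waiting times
history = disseminate(edges, "S", n=6, tau=tau, rate=rate, p=2**31 - 1)
for t, dims in enumerate(history, start=1):
    print(t, sorted(dims.items()))
```

In this example nodes A and B have min-cut 1 and node C has min-cut 2, so with a large field one would expect A and B to gain one dimension per slot and C two per slot once its parents start transmitting.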



Collected Subspaces: We can think of each of the $n$
source messages $\{x_i\}$ as corresponding to one dimension of
an $n$-dimensional space $\Pi_S \subseteq \mathbb{F}_q^{\ell}$, where $\Pi_S = \langle x_1, \ldots, x_n \rangle$.
We say that node $v \in V$ at time $t$ observes a subspace $\Pi_v(t) \subseteq \Pi_S$,
with dimension $d_v(t) = \dim(\Pi_v(t))$, if $\Pi_v(t)$ is the space
spanned by the received vectors at node $v$ up to time $t$. Initially,
at time $t = 0$, the collected subspaces of all nodes (apart from
the source) are empty: $d_v(0) = 0$, $\forall v \in V \setminus \{S\}$.



Waiting Times: We next define the waiting times, which
will be used in the following sections to ensure that the
subspaces of different nodes are distinct. Waiting times are
a usual assumption in dissemination protocols; indeed, for
large $n$ the waiting time does not affect the rate. For example,
in the information-theoretic proof of the main theorem in network
coding [13], each node waits until it collects at least one
message from each of its incoming links before starting
transmissions.



Definition 1: The waiting time $\tau_v$ for a node $v$ is the first
timeslot during which node $v$ receives information from the
source at a rate equal to its min-cut $c_v$, and additionally, has
collected in its buffer a subspace of dimension at least $c_v + 1$.

Note that, because we are dealing with acyclic graphs, we 
can impose a partial order on the waiting times of the nodes, 
such that all parents of a node have smaller waiting time 
than the node. Moreover, each node can decide whether the 
conditions for the waiting time are met, by observing whether 
it receives information at a rate equal to its min-cut, and what 
is the dimension of the subspace it has collected. That is, 
a node does not need to know any topological information 
(apart from its min-cut), and the waiting times do not need to 
be communicated in advance to the nodes, but can be decided 
online based on the network conditions. 



Source Operation and the Source Subspace $\Pi_S$

As we discussed, the source needs to convey to the receivers
$n$ source packets that span the $n$-dimensional subspace
$\Pi_S = \langle x_1, \ldots, x_n \rangle$, with $\Pi_S \subseteq \mathbb{F}_q^{\ell}$. $\Pi_S$ is isomorphic
to $\mathbb{F}_q^n$; thus, for the purpose of studying relationships between
subspaces of $\Pi_S$, we can equivalently assume that $\Pi_S = \mathbb{F}_q^n$ and
that node $v \in V$ at time $t$ observes a subspace $\Pi_v(t) \subseteq \Pi_S$. This
simplification is very natural in the case where we employ
coding vectors, reviewed briefly in the following, as we only
need to consider the coding vectors for our purposes and can
ignore the remaining contents of the packets; however, we can
also use the same approach in the case where the source performs
noncoherent coding, described subsequently.

a) Use of Coding Vectors: To enable receivers to decode,
the source assigns $n$ symbols of each message vector (packet)
to determine the linear relation between that packet and the
original vectors $x_i$, $i = 1, \ldots, n$. Without loss of generality, let
us assume these $n$ symbols (which form a vector of length $n$)
are placed at the beginning of each message vector. This vector
is called the coding vector. Each message vector $x_i$ contains two
parts. The vector $x_i^C \in \mathbb{F}_q^n$ with length $n$ is the coding vector
and the remaining part, $x_i^I \in \mathbb{F}_q^{\ell - n}$, is the information part, where

$$x_i = \left[\, x_i^C \mid x_i^I \,\right].$$

The coding vectors $x_i^C$, $i = 1, \ldots, n$, are chosen such that they
form a basis for $\mathbb{F}_q^n$. For simplicity we assume $x_i^C = e_i$, where
$e_i \in \mathbb{F}_q^n$ is a vector with one at position $i$ and zero elsewhere.

For our purposes, it is sufficient to restrict our algorithms
to examine the coding vectors. Thus, the source has the space
$\Pi_S = \mathbb{F}_q^n$; during the information dissemination, if a node $v$
at time $t$ has collected $m$ packets $z_i$ with coding vectors $z_i^C$,
it has observed the subspace $\Pi_v(t) = \langle z_1^C, \ldots, z_m^C \rangle$. In other
words, the coding vectors capture all the information we need
for our applications.
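
As a rough illustration of how coding vectors let a receiver decode, the Python sketch below prepends the unit vector $e_i$ to each source payload, forms random linear combinations as an intermediate node would, and recovers the payloads by Gauss-Jordan elimination on the coding part. The prime field size `P = 257`, the toy payloads, and the names `make_source_packets`, `random_combination` and `decode` are assumptions made for this example and do not come from the paper.

```python
import random

P = 257  # hypothetical prime field size; the paper allows any prime power q


def make_source_packets(payloads):
    """Prefix payload i with the unit coding vector e_i."""
    n = len(payloads)
    return [[int(j == i) for j in range(n)] + list(pl)
            for i, pl in enumerate(payloads)]


def random_combination(packets, rng):
    """One random linear combination (coding vector and payload together)."""
    coeffs = [rng.randrange(P) for _ in packets]
    return [sum(c * pkt[k] for c, pkt in zip(coeffs, packets)) % P
            for k in range(len(packets[0]))]


def decode(received, n):
    """Gauss-Jordan on the first n columns; None if their rank is below n."""
    m = [row[:] for row in received]
    for col in range(n):
        pivot = next((i for i in range(col, len(m)) if m[i][col]), None)
        if pivot is None:
            return None  # collected coding vectors do not span F_P^n
        m[col], m[pivot] = m[pivot], m[col]
        inv = pow(m[col][col], -1, P)
        m[col] = [(x * inv) % P for x in m[col]]
        for i in range(len(m)):
            if i != col and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % P for a, b in zip(m[i], m[col])]
    # The coding part is now the identity, so the rest is the payloads.
    return [row[n:] for row in m[:n]]


payloads = [[10, 20, 30], [40, 50, 60], [7, 8, 9]]
source = make_source_packets(payloads)
rng = random.Random(1)
relay = [random_combination(source, rng) for _ in range(3)]    # intermediate node
received = [random_combination(relay, rng) for _ in range(4)]  # receiver's buffer
print(decode(received, n=3))  # the payloads if the coding vectors have rank 3, else None
```

Only the first $n$ symbols of each packet are needed to determine the subspace a node has observed, which is why the rest of the paper works with coding vectors alone.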

b) Subspace Coding: Our approach also works in the
case of subspace coding, which was introduced in [17]. We next
briefly explain the idea of communication using subspaces, in
a network performing randomized network coding.

In the following, we use the same notation as introduced
in §II-B. Let $\{x_1, \ldots, x_n\}$, $x_i \in \mathbb{F}_q^{\ell}$, denote the set of
packets the source has. Assume that there is no error in
the network. An arbitrary receiver $R_v$ at node $v$ collects $m$
packets $z_i$, $i = 1, \ldots, m$, where each $z_i$ can be presented
as $z_i = \sum_{j=1}^{n} h_{ij} x_j$. The coefficients $h_{ij}$ are unknown and
randomly chosen over $\mathbb{F}_q$. In matrix form, the transmission
model can be represented as

$$Z_v = H_{Sv} X,$$

where $H_{Sv} \in \mathbb{F}_q^{m \times n}$ is a random matrix and $X \in \mathbb{F}_q^{n \times \ell}$ is
the matrix whose rows are the source packets.

The matrices $H_{Sv}$ are randomly chosen, under constraints
imposed by the network topology. As stated in [17] and proved
in [30], [31], [32], the above model naturally leads one to consider
information transmission not via the choice of $x_i$ but rather
by the choice of the vector space spanned by $\{x_i\}$.

In the case of subspace coding, the dissemination algorithm 
works in exactly the same way as in the case of coding vectors; 
what changes is how the source maps the information to the 
packets it transmits, and how decoding occurs. However, this is 
orthogonal to our purposes, since we perform no decoding of 
the information messages, but simply observe the relationship 
between the subspaces different nodes in the network collect. 
Thus, the same approach can be applied in this case as well. 

C. Input to Algorithms 

We are interested in designing algorithms that leverage the 
relationships between subspaces observed at different network 
nodes for network management and control. The algorithms 
design will depend on the information that we have access to. 
We distinguish between the following. 

• Global information: A central entity knows the subspaces
that all nodes in the network have observed.
• Local information: There is no such omniscient entity,
and each node $v$ only knows what it has received, its
own subspace $\Pi_v$.

We may also have information between these two extreme 
cases. Moreover, we may have a static view, where we take 
a snapshot of the network at a given time instant t, or a non- 
static view, where we take several snapshots of the network 
and use the subspaces' evolution to design an algorithm. 

We will argue in Section IV that capturing even global
information can be accomplished with relatively low overhead 
(sending one additional packet per node at the end of the 
dissemination protocol); thus, the algorithms we develop even 
assuming global information can in fact be implemented 
almost passively and at low cost. 

III. Properties of Random Vector Spaces over a
Finite Field $\mathbb{F}_q^n$

In this section, we will state and prove basic properties 
and results that we will exploit towards various applications 
in the following sections. In particular, we will investigate 
the properties of random sampling from vector spaces over 
a finite field. Such properties give us a better insight and 
understanding of randomized network coding and form a 
foundation for the results and algorithms presented in this 
paper. 



A. Sampling Subspaces over $\mathbb{F}_q^n$

Here, we explore properties of randomly sampled subspaces
from a vector space $\mathbb{F}_q^n$. We start with the following lemma
that explores properties of a single subspace.

Lemma 1: Suppose we choose $m$ vectors from an $n$-dimensional
vector space $\Pi_S = \mathbb{F}_q^n$ uniformly at random to
construct a space $\Pi$. Then the subspace $\Pi$ will be full rank (has
dimension $\min[m, n]$) w.h.p. (with high probability)³, namely,

$$P\left[\dim(\Pi) = \min[m, n]\right] = 1 - O\left(q^{-1}\right).$$

Proof: Refer to Appendix A. ■

We conclude that for large values of $q$, selecting $m \le n$
vectors uniformly at random from $\mathbb{F}_q^n$ to construct a subspace
$\Pi$ is equivalent to choosing an $m$-dimensional subspace from
$\mathbb{F}_q^n$ uniformly at random. Note that this is not true for small
values of $q$.
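
The dependence on $q$ in Lemma 1 can be checked numerically. The sketch below uses the standard counting formula for the probability that an $m \times n$ matrix with independent uniform entries over $\mathbb{F}_q$ has rank $\min[m, n]$, namely $\prod_{i=0}^{\min[m,n]-1} \left(1 - q^{\,i - \max[m,n]}\right)$; this closed form is a well-known fact brought in for illustration and is not quoted from the paper's appendix. The function name `prob_full_rank` is hypothetical.

```python
from fractions import Fraction


def prob_full_rank(m, n, q):
    """Exact P[rank = min(m, n)] for a uniform random m x n matrix over F_q."""
    lo, hi = min(m, n), max(m, n)
    prob = Fraction(1)
    for i in range(lo):
        # Row i (of the shorter side) must avoid the span of the previous i rows.
        prob *= 1 - Fraction(q**i, q**hi)
    return prob


# Failure probability for m = n = 4 as the field grows; q * failure stays bounded.
for q in (2, 16, 256, 65536):
    failure = 1 - prob_full_rank(4, 4, q)
    print(q, float(failure), float(q * failure))
```

The case $m = n$ is the worst one: for $m < n$ the failure probability decays faster than $q^{-1}$, which is still consistent with the $1 - O(q^{-1})$ statement of the lemma.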

We next examine connections between multiple subspaces.

Lemma 2: Let $\Pi_1$ and $\Pi_2$ be two subspaces of $\Pi_S = \mathbb{F}_q^n$
with dimension $d_1$ and $d_2$ respectively, intersection of
dimension $d_{12}$, and $\Pi_1 \not\subseteq \Pi_2$ (i.e., $d_{12} < d_1$). Construct $\Pi'_1$
by choosing $m$ vectors from $\Pi_1$ uniformly at random. Then

$$P\left[\Pi'_1 \subseteq \Pi_2\right] = O\left(q^{-m}\right).$$

Proof: Refer to Appendix A. ■

Lemma 3: Suppose $\Pi_k$ is a $k$-dimensional subspace of a
vector space $\Pi_S = \mathbb{F}_q^n$. Select $m$ vectors uniformly at random
from $\Pi_S$ to construct the subspace $\Pi$. We have

$$\dim(\Pi \cap \Pi_k) = \min\left[k, (m - (n - k))^+\right] = (\min[m, n] + k - n)^+, \qquad (4)$$

with probability $1 - O\left(q^{-1}\right)$.

Proof: Refer to Appendix A. ■
Corollary 1: Suppose $\Pi_1$ and $\Pi_2$ are two subspaces of $\mathbb{F}_q^n$
with dimension $d_1$ and $d_2$ respectively and joint dimension
$d_{12}$. Let us take $m_1$ vectors uniformly at random from $\Pi_1$
and $m_2$ vectors from $\Pi_2$ to construct subspaces $\hat{\Pi}_1$ and $\hat{\Pi}_2$.
We have

$$\dim(\hat{\Pi}_1 \cap \hat{\Pi}_2) = \min\left[ d_{12},\ (m_1 + m_2 - (d_1 + d_2 - d_{12}))^+,\ (m_1 - (d_1 - d_{12}))^+,\ (m_2 - (d_2 - d_{12}))^+ \right],$$

with probability $1 - O\left(q^{-1}\right)$.

Proof: Refer to Appendix A. ■

By choosing $\Pi_1 = \Pi_2 = \mathbb{F}_q^n$ in Corollary 1 we have the
following corollary.

Corollary 2: Let us construct two subspaces $\hat{\Pi}_1$ and $\hat{\Pi}_2$ by
choosing $m_1$ and $m_2$ vectors uniformly at random respectively
from $\mathbb{F}_q^n$. Then the subspaces $\hat{\Pi}_1$ and $\hat{\Pi}_2$ will be disjoint with
probability $1 - O\left(q^{-1}\right)$ if $m_1 + m_2 \le n$.
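
The intersection formula (4) of Lemma 3 can be checked empirically for a large field. The sketch below fixes $\Pi_k$ as the span of the first $k$ unit vectors, samples $m$ uniform vectors, and compares the observed intersection dimension with the prediction. The assumptions are a prime field with the hypothetical choice p = 2^31 − 1 (so that the $O(q^{-1})$ failure event is very unlikely), a fixed random seed, and the invented helper names `rank_gf`, `intersection_dim` and `check_lemma3`.

```python
import random


def rank_gf(rows, p):
    """Rank over GF(p), p prime, of the matrix whose rows are `rows`."""
    m = [[x % p for x in r] for r in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def intersection_dim(A, B, p):
    """dim of the intersection of two row spaces, via the dimension formula."""
    return rank_gf(A, p) + rank_gf(B, p) - rank_gf(A + B, p)


def check_lemma3(n, k, m, p, rng):
    """Return (observed, predicted) dim of Pi ∩ Pi_k for m random vectors."""
    pi_k = [[int(i == j) for j in range(n)] for i in range(k)]
    pi = [[rng.randrange(p) for _ in range(n)] for _ in range(m)]
    observed = intersection_dim(pi, pi_k, p)
    predicted = max(min(m, n) + k - n, 0)
    return observed, predicted


rng = random.Random(0)
p = 2**31 - 1
for n, k, m in [(6, 3, 2), (6, 3, 4), (6, 4, 5), (6, 2, 8)]:
    print((n, k, m), check_lemma3(n, k, m, p, rng))
```

Setting $k$ to the number of vectors in a second random sample gives an analogous check of Corollary 2: when the two sample sizes add up to at most $n$, the predicted intersection dimension is zero.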

We are now ready to discuss one of the important properties
of randomly chosen subspaces which is very useful for our
work: randomly selected subspaces tend to be "as far as
possible". We will clarify and make precise what we mean
by "as far as possible", see also [33]. We first review the
definition of a subspace in general position with respect to a
family of subspaces.

3 Throughout this paper, when we talk about an event occurring with high
probability, we mean that its probability behaves like $1 - O(q^{-1})$, which
goes to 1 as $q \to \infty$.

Definition 2 ([33, Chapter 3]): Let $\Pi_S$ be an $n$-dimensional
space over the field $\mathbb{F}_q$ and for $i = 1, \ldots, r$, let
$\Pi_i$ be a subspace of $\Pi_S$, with $\dim(\Pi_i) = d_i$. A subspace
$\Pi \subseteq \Pi_S$ of dimension $d$ is in general position with respect to
the family $\{\Pi_i\}$ if

$$\dim(\Pi_i \cap \Pi) = \max\left[d_i + d - n, 0\right], \quad \forall i \in \{1, \ldots, r\}. \qquad (5)$$

It should be noted that $\max[d_i + d - n, 0]$ is the minimum
possible dimension of $\Pi_i \cap \Pi$. So what the above definition
says is that the intersection of $\Pi$ and each $\Pi_i$ is as small as
possible. Using the above definition we can state the following
theorem⁴.

Theorem 1: Suppose $\{\Pi_i\}$, $i = 1, \ldots, r$, are subspaces
of $\Pi_S = \mathbb{F}_q^n$. Let us construct a subspace $\Pi$ by randomly
choosing $m$ vectors from $\Pi_S$. Then $\Pi$ will be in general
position with respect to the family $\{\Pi_i\}$ w.h.p., i.e., with
probability $1 - O\left(q^{-1}\right)$.

Proof: Refer to Appendix A. ■

Theorem 1 demonstrates a nice property of randomized network
coding, where the subspaces spanned by coding vectors
tend to be as far as possible on different paths of the network.

B. Rate of Innovative Packets 

In the following sections, we will need to know the rate
of receiving innovative message vectors (packets) at receivers
in a dissemination protocol performing randomized network
coding. By innovative we refer to vectors that do not belong
to the space spanned by already collected packets. As shown
in [13], the source can multicast at a rate equal to the
minimum min-cut of all receivers if the intermediate nodes
can combine the incoming messages. Moreover, it is shown
in [14] that using linear combinations is sufficient to achieve
information transfer at a rate equal to the minimum min-cut
of all receivers. In fl3l . (TJ, it is also demonstrated that 
choosing the coefficients of the linear combinations randomly 
is sufficient (no network-specific code design is required) with 
high probability if the field size is large enough. 

To find the rate of receiving information at each node where
the implemented dissemination protocol performs randomized
network coding, we can use the result given in Theorem 2.
Note that our described dissemination protocol,
although very common in practice, does not exactly fit
the previous theoretical results in the literature that examine 
rates, because the operation of the network nodes is not 
memory-less. That is, while for example in Q), 0, 11141 each 
transmitted packet at time t is a function of a small subset of 
the received packets up to time t (the ones corresponding to the 
same information message), in our case a packet transmitted at 
time t is a random linear combination of all packets received 
up to time t. This small variant of the main theorem on
randomized network coding is very intuitive, and we formally
state it in the following.

4 Versions of this theorem can be easily derived from results in the literature
[33], but we repeat here the short derivation for completeness.



Theorem 2: Consider a source that transmits $n$ packets
over a connected network using the dissemination protocol
described in §II-B and assume that the network nodes perform
random linear network coding over a sufficiently large finite
field. Then there exists $t_0$ such that for all $t \ge t_0$ each node
$v$ in the network receives $c_v$ independent linear combinations
of the $n$ source packets per time slot, where $c_v = \mathrm{mincut}(v)$.

Proof: Refer to Appendix B-A. ■

Given Theorem |2] we can state the following definition. 

Definition 3: For a specific information dissemination protocol
over a network, we define the steady state as the time
period during which each node $v$ in the network receives
exactly $c_v$ independent linear combinations of the $n$ source
packets per time slot and none of the nodes, except the source
$S$, has collected $n$ linearly independent combinations. We call
the time at which the network enters the steady-state phase the
steady-state starting time and denote it by $T_s$. If the network
never attains the steady-state phase, we set $T_s = \infty$.

For our protocol in §II-B, $T_s$ depends not only on the
network topology, but also on the waiting times $\tau_v$. For the
waiting times defined in Definition 1, we can upper bound $T_s$
as stated in Lemma 4.

Lemma 4: If $n$ is large enough, for the dissemination
protocol given in §II-B we may upper bound the steady-state
starting time as follows:

$$T_s < 2D(G) - 1,$$

where $D(G)$ is the length of the longest path from the source to
any other node in the network.

Proof: Refer to Appendix A. ■

To be sure that the dissemination protocol given in
§II-B enters the steady-state phase, $n$ should be large enough.
Using Lemma 4, we obtain the following result, Corollary 3.

Corollary 3: A sufficient condition on $n$ for the protocol
to enter the steady state is that

$$2D(G) - 1 < \left\lfloor \frac{n}{c_{\max}} \right\rfloor,$$

where $c_{\max} = \max_{v \in V} c_v$.
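The condition of Corollary 3 can be checked numerically. The following is a minimal sketch, assuming the strict inequalities as printed in Lemma 4 and Corollary 3; the function name `steady_state_window` and its arguments are illustrative and do not come from the paper.

```python
def steady_state_window(n, depth, c_max):
    """Return (low, high) such that any slot t with low < t < high is a
    candidate sampling time, or None if the window is empty.

    low  = 2*D(G) - 1        (bound on the steady-state start, Lemma 4)
    high = floor(n / c_max)  (before any node can have collected all n packets)
    """
    low = 2 * depth - 1
    high = n // c_max
    return (low, high) if low < high else None


# Hypothetical numbers: a network of depth 10 with unit min-cuts.
print(steady_state_window(n=200, depth=10, c_max=1))
print(steady_state_window(n=15, depth=10, c_max=1))
```

The second call returns `None` because 15 source packets are too few for a network of depth 10 to reach the steady state before some node has collected the whole space.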

IV. Topology Inference 

In this section, we use the tools developed in §III to
investigate the relation between the network topology and the
subspaces collected at the nodes during information dissemination.
We develop conditions that allow us to passively
infer the network topology with no error (asymptotically in the
field size $q$). The proposed scheme is passive in the sense
that it alters neither the normal data flow of the network
nor the information rates that can be achieved. In fact, we
can think of our protocol as identifying the topology of the
network that is induced by the traffic.

We build our intuition starting from information dissemination
in tree topologies, and then extend our results to
arbitrary topologies. Note that information dissemination using
network coding in tree topologies does not offer throughput
benefits as compared to routing; however, it is an interesting
case study that will naturally lead to our framework for
general topologies. We then define conditions under which
the topology of a tree and that of an arbitrary network can be
uniquely identified using the observed subspaces. Note that
uniquely identifying the topology is a strong requirement, as
the number of topologies for a given number of network nodes
is exponential in the number of nodes.

5 Note that $D(G)$ is different from the longest shortest path, which is called
the diameter of $G$ in the graph theory literature.

A. Tree Topologies 

Let $G = (V, E)$ be a network that is a directed tree of
depth $D(G)$, rooted at the source node $S$. We will present
(i) necessary and sufficient conditions under which the tree 
topology can be uniquely identified, and (ii) given that these 
conditions are satisfied, algorithms that allow us to do so. 

We first consider trees where each edge is allocated the 
same rate c, and thus the min-cut from the source to each 
node of the tree equals c. We then briefly discuss the case 
of undirected trees. Finally we examine the case where edges 
are allocated different rates, and thus nodes may have different 
min-cuts from the source. 

1) Common Min-Cut: Assume that each edge of the tree
has the same capacity $c$ (i.e., a rate allocation algorithm has
assigned the same rate $r_e = c$ to each edge of the tree).
Thus all nodes in the tree have the same min-cut, equal to
$c$. Then, according to the dissemination protocol introduced
in Algorithm II.1, each node $v$ waits for time $\tau_v$, until it
has collected a $(c+1)$-dimensional subspace, and then starts
transmitting to its children. Our claim is that we can then
identify the network topology using a single snapshot of all
nodes' subspaces at a time $t$. Before formally proving the result
in Theorem 3, we give some intuition on why this is so,
and why the waiting time is crucial to achieving it. We start
from an example on the simple network in Figure 2.

Example 1: Consider the tree in Figure 2 and assume that
the edges have unit capacity ($c = 1$). Algorithm II.1 works
as follows. At time $t = 1$, node $A$ receives a vector $y_1$ from
the source $S$. Node $A$ waits, as it has not yet collected a
$c + 1 = 2$ dimensional subspace. At time $t = 2$, it receives a
vector $y_2$. It now has collected the subspace $\Pi_A(2) = \langle y_1, y_2 \rangle$,
and thus at the next time slot it will start transmitting. At time
$t = 3$, node $A$ transmits vectors $y_1^B$ and $y_1^C$ to nodes $B$ and
$C$ respectively, with $y_1^B, y_1^C \in \Pi_A(2)$. Thus $\Pi_B(3) = \langle y_1^B \rangle$
and $\Pi_C(3) = \langle y_1^C \rangle$. Node $A$ also receives a vector $y_3$ from
the source, and thus $\Pi_A(3) = \langle y_1, y_2, y_3 \rangle$. Consider now the
subspaces $\Pi_A(3)$, $\Pi_B(3)$ and $\Pi_C(3)$. We see that $\Pi_B(3) \subset
\Pi_A(3)$ and $\Pi_C(3) \subset \Pi_A(3)$; we thus conclude that nodes $B$
and $C$ are children of node $A$. Moreover, $\Pi_B(3) \neq \Pi_C(3)$,
which will allow us to distinguish between the children of these
two nodes when we deal with larger trees.

In contrast, if Algorithm II.1 did not impose a waiting time,
and node $A$ started transmitting to nodes $B$ and $C$ at time
$t = 2$, then both nodes $B$ and $C$ would receive the same vector
$y_1$, i.e., $\Pi_B(2) = \Pi_C(2) = \langle y_1 \rangle$. In fact, at all subsequent
times, we would have $\Pi_B(t) = \Pi_C(t) = \Pi_A(t-1)$. Thus,
we would not be able to distinguish between these two nodes.
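The effect of the waiting time in Example 1 can be reproduced with a small simulation. The sketch below is illustrative only: it assumes a prime field of size $2^{31}-1$ standing in for a "sufficiently large" $q$, $n = 8$ source packets, and the four-node tree S, A, B, C with unit capacities. The helper names (`rank`, `contained`, `random_combination`, `simulate`) are hypothetical and are not part of the paper.

```python
import random

P = 2**31 - 1  # a large prime; assumption standing in for the field size q


def rank(vectors, p=P):
    """Rank over GF(p) of a list of equal-length vectors (Gaussian elimination)."""
    rows = [list(v) for v in vectors]
    r = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] % p), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        inv = pow(rows[r][col], -1, p)  # modular inverse (Python 3.8+)
        rows[r] = [x * inv % p for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] % p:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def contained(a, b):
    """True if span(a) is a subspace of span(b)."""
    return rank(a + b) == rank(b)


def random_combination(basis, p=P):
    """A uniformly random linear combination of the given vectors."""
    coeffs = [random.randrange(p) for _ in basis]
    return [sum(c * v[k] for c, v in zip(coeffs, basis)) % p
            for k in range(len(basis[0]))]


def simulate(first_tx_slot, slots=5, n=8):
    """Run the tree S -> A -> {B, C}; A starts forwarding at first_tx_slot."""
    source = [[int(i == j) for j in range(n)] for i in range(n)]
    A, B, C = [], [], []
    for t in range(1, slots + 1):
        held_by_A = list(A)                    # Pi_A(t-1)
        A.append(random_combination(source))   # A receives one vector from S
        if t >= first_tx_slot:                 # A forwards combinations of Pi_A(t-1)
            B.append(random_combination(held_by_A))
            C.append(random_combination(held_by_A))
    return A, B, C


random.seed(1)
A, B, C = simulate(first_tx_slot=3)   # with waiting, as in Example 1
print(contained(B, A), contained(C, A), contained(B, C), contained(C, B))
A, B, C = simulate(first_tx_slot=2)   # no waiting
print(contained(B, C) and contained(C, B))
```

With the waiting time, the subspaces of B and C are both contained in that of A but, with overwhelming probability over the random coefficients, neither contains the other; without it, the two children end up with the same subspace and cannot be told apart.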




Fig. 2. Directed tree with four nodes rooted at the source S. 

The main idea in our result is that, if we consider two nodes
$u$ and $v$ in the network which have collected subspaces $\Pi_u(t)$
and $\Pi_v(t)$ at time $t$, then, unless $u$ and $v$ have a child-ancestor
relationship (i.e., are on the same branch of the tree), it holds
that $\Pi_u(t) \nsubseteq \Pi_v(t)$ and $\Pi_v(t) \nsubseteq \Pi_u(t)$.

The challenge in proving this is that we deal with subspaces
evolving over time, and thus we cannot directly apply the
results in §III. For example, for the network in Figure 2,
$\Pi_B(t)$ and $\Pi_C(t)$ are not subspaces that are selected uniformly
at random from $\Pi_A(t)$; instead, they are built over time as
$\Pi_A(t)$ also evolves. We will thus need the following two
results, which modify the results in §III to take into account
the time evolution in the creation of the subspaces. We start
by examining in Lemma 5 the relationship between subspaces
collected at the immediate children of a given parent node
(for example, at the children $B$ and $C$ of node $A$). These
are created by sampling the same subspaces (those at node
$A$). We then examine in Corollary 4 the relationship between
subspaces collected at nodes that have different parents (for
example, a node that has $B$ as parent and a node that has $C$
as parent).

Lemma 5: Suppose there exist (proper) subspaces $\Pi(0) \subset
\Pi(1) \subset \cdots \subset \Pi(t-1)$ with dimensions $d_0, \ldots, d_{t-1}$,
respectively. Let us construct the set of subspaces $\Pi_u(i)$,
$i = 1, \ldots, t$, as follows. Set $\Pi_u(i) = \sum_{j=1}^{i} \pi_u(j)$, where
$\pi_u(j)$ is the span of $k_u(j)$ vectors chosen uniformly at random
from $\Pi(j-1)$ such that $k_u(1) < d_0$ and $k_u(j) < (d_{j-1} - d_{j-2})$
for $j = 2, \ldots, t$. Similarly, we construct the set of subspaces
$\Pi_v(i) = \sum_{j=1}^{i} \pi_v(j)$, where for $k_v(j)$ we have similar
conditions, namely, $k_v(1) < d_0$ and $k_v(j) < (d_{j-1} - d_{j-2})$
for $j = 2, \ldots, t$. Then we have

$$\Pi_u(i) \nsubseteq \Pi_v(j) \quad \text{and} \quad \Pi_v(j) \nsubseteq \Pi_u(i) \qquad \forall i, j \in \{1, \ldots, t\},$$

with high probability.

Proof: Refer to Appendix A. ■
Corollary 4: Suppose that there exist two sets of subspaces
$\{\Pi_u(i)\}_{i=0}^{t-1}$ and $\{\Pi_v(i)\}_{i=0}^{t-1}$ such that $\Pi_u(0) \subset \cdots \subset \Pi_u(t-1)$
and $\Pi_v(0) \subset \cdots \subset \Pi_v(t-1)$. Moreover, assume that
$\Pi_u(i) \nsubseteq \Pi_v(j)$ and $\Pi_v(j) \nsubseteq \Pi_u(i)$ $\forall i, j \in \{0, \ldots, t-1\}$.
Now, construct two sets of subspaces $\{\Pi_a(i)\}_{i=1}^{t}$ and
$\{\Pi_b(i)\}_{i=1}^{t}$ by setting $\Pi_a(i) = \sum_{j=1}^{i} \pi_a(j)$ and $\Pi_b(i) =
\sum_{j=1}^{i} \pi_b(j)$, where $\pi_a(i)$ is chosen uniformly at random from
$\Pi_u(i-1)$ and $\pi_b(i)$ is chosen uniformly at random from
$\Pi_v(i-1)$ (with some arbitrary dimension). Then we have

$$\Pi_a(i) \nsubseteq \Pi_b(j) \quad \text{and} \quad \Pi_b(j) \nsubseteq \Pi_a(i) \qquad \forall i, j \in \{1, \ldots, t\},$$

with high probability.

Proof: Refer to Appendix A. ■

Theorem 3: Consider a tree of depth $D(G)$ where each
edge has capacity $c$, and the dissemination Algorithm II.1.
A static global view of the network at time $t$, with
$2D(G) - 1 < t < \lfloor n/c \rfloor$, allows us to uniquely determine the tree
structure with high probability, if the waiting times are chosen
according to Definition 1.

Proof: We will say that a node of the tree is at level $l$
if it has distance $l$ from the source. In a tree there exists a
unique path $\mathcal{P}_u = \{S, P^{l_u-1}(u), \ldots, P(u), u\}$ from the source $S$
to a node $u$ at level $l_u$ of the network.

If we consider a time $t$ in steady state (where all nodes
have nonempty subspaces and none has collected the whole
space), then clearly, using Algorithm II.1 for dissemination in
the network, for the nodes along the path $\mathcal{P}_u$ it holds that

$$\Pi_u(t) \subset \Pi_{P(u)}(t) \subset \cdots \subset \Pi_{P^{l_u-1}(u)}(t) \subset \Pi_S. \qquad (6)$$

Note that the conditions on $t$ ensure that the network is in
steady state.

To identify the topology of the tree it is sufficient to show
that $\Pi_u(t) \nsubseteq \Pi_v(t)$ for any node $v$ that is not in $\mathcal{P}_u$. Let $l_u$
and $l_v$ be the distances of $u$ and $v$ from the source, respectively.

First, we observe that, starting from the source, by applying
Lemma 3 and Corollary 4 and because of Definition 1, the
subspaces of the nodes at the same level (same distance from
the source) are different at all times. So it only remains to
check the condition $\Pi_u(t) \nsubseteq \Pi_v(t)$ for those nodes $v$ that are
not at the same level as $u$.

Consider two cases. First, if $l_u < l_v$, let $v'$ be the
ancestor of $v$ at the same level as $u$. By Corollary 4 we have
$\Pi_u(t) \nsubseteq \Pi_{v'}(t)$, so $\Pi_u(t) \nsubseteq \Pi_v(t)$ because $\Pi_v(t) \subset \Pi_{v'}(t)$.

Now consider the second case, $l_u > l_v$. We start by assuming
$\Pi_u(t) \subset \Pi_v(t)$ and then show that this assumption
leads to a contradiction. Let $u'$ be the ancestor of $u$ at the same
level as $v$. Then we make the following observation. If at time
$t$ we have $\Pi_u(t) \subset \Pi_v(t)$, then by Lemma 2 we should have had
$\Pi_{P(u)}(t-1) \subset \Pi_v(t)$, and so $\Pi_{P^2(u)}(t-2) \subset \Pi_v(t)$, and finally
we should have had $\Pi_{u'}(t - l_u + l_v) \subset \Pi_v(t)$. But according
to Corollary 4 this is a contradiction, because $u'$ and $v$ are at
the same level.

In the above argument, we have shown that $\Pi_{P(u)}(t)$ is the
smallest subspace that contains $\Pi_u(t)$ among all nodes' subspaces
at time $t$. So we are done. ■

Assume now that the conditions of Theorem 3 hold. To determine
the tree structure, it is sufficient to determine the unique parent
of each node. From the previous arguments, the parent of node $u$ is
the unique node $v \neq u$ such that $\Pi_v(t)$ is the minimum-dimension
subspace that contains $\Pi_u(t)$. That is, writing $d_w = \dim(\Pi_w(t))$,
the parent of node $u$ is the node $v$ such that

$$v = \arg\min_{w \in V \setminus \{u\}:\ \Pi_u(t) \subseteq \Pi_w(t)} d_w.$$

As we will discuss in Section IV-C, collecting the subspace
information from the network nodes can be implemented
efficiently. The algorithm that determines the tree topology
reduces this information to only two "sufficient statistics":
the dimension of each subspace, $d_u = \dim(\Pi_u)$, $\forall u \in V$,
and the dimension of the intersection of every two subspaces,
$d_{uv} = \dim(\Pi_u \cap \Pi_v)$, $\forall u, v \in V$, as described in
Algorithm IV.1, assuming that the conditions of Theorem 3 hold.



Algorithm IV.1: TREE($\{d_u\}$, $\{d_{uv}\}$)

    for each $u \in V$ do
        if $d_u = n$
            then $u \leftarrow S$
            else node $u$ has as parent the node $v$ with
                 $v = \arg\min_{w \in V,\ w \neq u:\ d_{uw} = d_u} d_w$

Alg. IV.1. Find the network topology for a tree.
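The parent-selection rule of the tree algorithm can be sketched in a few lines. This is an illustrative implementation, not the authors' code: the function name `infer_tree` and the dictionary layout are assumptions, and the candidate set explicitly excludes the node itself. The sample data are the dimensions from Example 1 at time $t = 3$, with a hypothetical $n = 8$.

```python
def infer_tree(dim, inter, n):
    """Return {node: parent} from subspace dimensions.

    dim[u]                  = dim(Pi_u)
    inter[frozenset((u,w))] = dim(Pi_u ∩ Pi_w)
    A node whose subspace has full dimension n is taken to be the source.
    """
    parent = {}
    for u in dim:
        if dim[u] == n:
            parent[u] = None  # u is the source S
            continue
        # Nodes w whose subspace contains Pi_u: dim(Pi_u ∩ Pi_w) == dim(Pi_u).
        candidates = [w for w in dim
                      if w != u and inter[frozenset((u, w))] == dim[u]]
        # The parent is the containing node of smallest dimension.
        parent[u] = min(candidates, key=lambda w: dim[w])
    return parent


# Dimensions from Example 1 at t = 3 (n = 8 is an assumed value).
dim = {"S": 8, "A": 3, "B": 1, "C": 1}
inter = {frozenset(pair): d for pair, d in {
    ("S", "A"): 3, ("S", "B"): 1, ("S", "C"): 1,
    ("A", "B"): 1, ("A", "C"): 1, ("B", "C"): 0,
}.items()}
print(infer_tree(dim, inter, n=8))
# {'S': None, 'A': 'S', 'B': 'A', 'C': 'A'}
```

Only dimensions are needed, which is why the nodes' reports can be reduced to the two sufficient statistics before inference.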



2) Directed vs. Undirected Networks: In a tree with a single
source, since new information can only flow from the source to
each node along a single path, whether the network is directed
or undirected makes no difference. In other words, from (6),
all vectors that a node sends to its predecessor belong
to the subspace the predecessor already has. Thus Theorem 3
still holds for undirected networks with a common min-cut.

3) Different Min-Cuts: Assume now that the edges of the
tree have different capacities, i.e., are assigned different rates. In
this case, the proof of Theorem 3 still holds, provided that the
condition in Theorem 3 is modified to

$$2D(G) - 1 < t < \left\lfloor \frac{n}{c_{\max}} \right\rfloor,$$

where $c_{\max} = \max_{v \in V} c_v$.

We underline that this theorem would not hold without the 
assumption in (O . Without this condition, it is possible that 
we cannot distinguish between nodes at the same level with a
common parent, as explained in the following example.

Example 2: Suppose that in the network in Figure 2 edge $SA$ has
unit capacity, while edges $AB$ and $AC$ have capacity two.
In this case it is easy to see that there exists $t_0$ such that
$\Pi_B(t) = \Pi_C(t) = \Pi_A(t-1)$, $\forall t > t_0$. Clearly, in this case
we cannot distinguish between nodes $B$ and $C$ with this
dissemination protocol. ■

B. General Topologies 

Consider now an arbitrary network topology, corresponding 
to a directed acyclic graph. One intuition we get from
examining tree structures is that we can distinguish between
two topologies provided all node subspaces are distinct. This
is used to identify the unique parent of each node. In general 
topologies, it is similarly sufficient to identify the parents of 
each node, in order to learn the graph topology. The following 
theorem claims that having distinct subspaces is in fact a 
sufficient condition for topology identifiability over general 
graphs as well. 

Theorem 4: In a synchronous network employing randomized
network coding over $\mathbb{F}_q$, a sufficient condition to uniquely
identify the topology with high probability as $q \gg 1$ is that

$$\Pi_u(t) \neq \Pi_v(t) \qquad \forall u, v \in V,\ u \neq v, \qquad (7)$$

for some time $t$. Under this condition, we can identify the
topology by collecting global information at times $t$ and $t + 1$,
i.e., two consecutive static views of the network.

Proof: Assume node $u$ has the $p$ parents $P(u) =
\{u_1, \ldots, u_p\}$. Let $\Pi_u^{(u_1)}(t), \ldots, \Pi_u^{(u_p)}(t)$ denote the subspaces
node $u$ has received from its parents up to time $t$, where
$\Pi_u(t) = \sum_{i=1}^{p} \Pi_u^{(u_i)}(t)$. By construction it is clear that

$$\Pi_u^{(u_i)}(t+1) \subset \Pi_{u_i}(t).$$

To identify the network topology, it is sufficient to decide
which node $v \in V$ is the parent that sent the subspace $\Pi_u^{(u_i)}(t+1)$
to node $u$, for each $i$, and thus find the $p$ parents of node $u$.
We claim that, provided (7) holds, node $u$ has as parent the
node $v$ which at time $t$ has the smallest-dimension subspace
containing $\Pi_u^{(u_i)}(t+1)$. Thus we can uniquely identify the
network topology from two static views, at times $t$ and $t + 1$,
as Algorithm IV.2 describes.

Indeed, let $\pi_u^{(u_i)}(t)$ denote the subspace that node
$u$ receives from parent $u_i$ at exactly time $t$, that
is, $\Pi_u^{(u_i)}(t+1) = \Pi_u^{(u_i)}(t) + \pi_u^{(u_i)}(t+1)$. For each $i \in
\{1, \ldots, p\}$, if $\pi_u^{(u_i)}(t+1) \nsubseteq \Pi_v(t)$ for all $v \in V \setminus \{u_i\}$,
then clearly $\Pi_u^{(u_i)}(t+1) \nsubseteq \Pi_v(t)$ for all $v \in V \setminus \{u_i\}$, and we are
done. Otherwise, using Lemma 2 and because (7) holds, with
high probability we have $\pi_u^{(u_i)}(t+1) \nsubseteq \Pi_w(t)$ for all $w \in V$
except those nodes whose subspaces contain $\Pi_{u_i}(t)$. So we
are done. ■

Note that to identify the network topology, we need to
know, for all nodes $u$, the dimension $d_u = \dim(\Pi_u(t))$ of
their observed subspaces at time $t$, the dimension $d_u^{(i)} =
\dim(\Pi_u^{(u_i)}(t+1))$ for all parents $u_i$ of node $u$, and the dimension
of the intersection of $\Pi_u^{(u_i)}(t+1)$ with all $\Pi_w(t)$, $w \in V$,
denoted as $d_{uw}^{(i)} = \dim(\Pi_u^{(u_i)}(t+1) \cap \Pi_w(t))$. Algorithm IV.2
uses this information to infer the topology.



Algorithm IV.2: GEN($\{d_u\}$, $\{d_u^{(i)}\}$, $\{d_{uw}^{(i)}\}$)

    for each $u \in V$ do
        if $d_u = n$
            then $u \leftarrow S$
            else for each $i \in \{1, \ldots, p_u\}$ do
                     node $u$ has as parent the node $v$ with
                     $v = \arg\min_{w \in V:\ d_{uw}^{(i)} = d_u^{(i)}} d_w$

Alg. IV.2. Find the topology of a general network.
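The same selection rule extends to general graphs once each node reports, per incoming link, the subspace received on that link. The following sketch is an assumption-laden illustration: the function name `infer_parents`, the data layout, and all numbers (a four-node diamond S→A, S→B, A→C, B→C with $n = 10$) are hypothetical and chosen only to be mutually plausible.

```python
def infer_parents(dim, recv_dim, recv_inter, n):
    """Return {node: set of parents} from two consecutive snapshots.

    dim[u]              = dim(Pi_u(t))
    recv_dim[u][i]      = dim of the subspace u received on its i-th link up to t+1
    recv_inter[u][i][w] = dim of the intersection of that subspace with Pi_w(t)
    """
    parents = {}
    for u in dim:
        parents[u] = set()
        if dim[u] == n:        # u is the source; it has no parents
            continue
        for d_i, inter_i in zip(recv_dim[u], recv_inter[u]):
            # Nodes whose time-t subspace contains the subspace received on link i.
            candidates = [w for w in dim if w != u and inter_i[w] == d_i]
            parents[u].add(min(candidates, key=lambda w: dim[w]))
    return parents


# Hypothetical diamond network: S -> A, S -> B, A -> C, B -> C.
dim = {"S": 10, "A": 4, "B": 4, "C": 4}
recv_dim = {"A": [5], "B": [5], "C": [3, 3]}
recv_inter = {
    "A": [{"S": 5, "B": 0, "C": 2}],
    "B": [{"S": 5, "A": 0, "C": 2}],
    "C": [{"S": 3, "A": 3, "B": 0}, {"S": 3, "A": 0, "B": 3}],
}
print(infer_parents(dim, recv_dim, recv_inter, n=10))
```

For this input node C is assigned the two parents A and B, while A and B are each assigned the source.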



The sufficient condition in Theorem 4 may or may not
hold, depending on the network topology and the information
dissemination protocol. Next, we investigate for which network
topologies condition (7) holds for the dissemination
Algorithm II.1, so that the network is identifiable.

Lemma 6: Consider two arbitrary nodes $u$ and $v$, where
$P(u) = \{u_1, \ldots, u_{p_u}\}$ and $P(v) = \{v_1, \ldots, v_{p_v}\}$ are the parents
of $u$ and $v$ respectively. Let $\Pi_{P(u)}(t-1) = \sum_{i=1}^{p_u} \Pi_{u_i}(t-1)$
and $\Pi_{P(v)}(t-1) = \sum_{i=1}^{p_v} \Pi_{v_i}(t-1)$. If $\Pi_u(t) = \Pi_v(t)$,
we should have had $\Pi_{P(u)}(t-1) = \Pi_{P(v)}(t-1)$ w.h.p.

Proof: Suppose $\Pi_{P(u)}(t-1) \neq \Pi_{P(v)}(t-1)$ and let us
assume that $\Pi_u(t) = \Pi_v(t) = \Pi$. This implies that if $\pi_u(t)$
and $\pi_v(t)$ are the subspaces collected by nodes $u$ and $v$ at time $t$,
then

$$\Pi_u(t) = \Pi_v(t) = \Pi,$$
$$\pi_u(t) + \Pi_u(t-1) = \pi_v(t) + \Pi_v(t-1).$$

By construction, we have $\Pi = \Pi_u(t) \subset \Pi_{P(u)}(t-1)$ and
$\Pi = \Pi_v(t) \subset \Pi_{P(v)}(t-1)$.

On the other hand, since we randomly chose $\pi_u^{(u_i)}(t)$ from
$\Pi_{u_i}(t-1)$ and since $\pi_u^{(u_i)}(t) \subset \Pi$ (because $\pi_u(t) \subset \Pi$), using
Lemma 2 we conclude that we should have $\Pi_{u_i}(t-1) \subset \Pi$,
which means we should have $\Pi_{P(u)}(t-1) \subset \Pi$. Similarly, we
should have $\Pi_{P(v)}(t-1) \subset \Pi$. As a result, w.h.p. we must
have

$$\Pi_{P(v)}(t-1) = \Pi_{P(u)}(t-1) = \Pi,$$

which is a contradiction, so we are done. ■
Corollary 5: If $\Pi_u(t) = \Pi_v(t) = \Pi$ for $t > l$, we should
have had $\Pi_{P^l(u)}(t-l) = \Pi_{P^l(v)}(t-l) = \Pi$, w.h.p.

Proof: Consider the parents of nodes $u$ and $v$ as supernodes
$P(u)$ and $P(v)$. Using an argument similar to that of
Lemma 6, we can conclude that the parents of $P(u)$ and $P(v)$,
denoted by $P^2(u)$ and $P^2(v)$, should satisfy

$$\Pi_{P^2(u)}(t-2) = \Pi_{P^2(v)}(t-2) = \Pi.$$

We apply this argument $l$ times to get the result. ■
Lemma 7: If the dissemination protocol is in the steady
state, $t > T_s$, we cannot have $\Pi_u(t) = \Pi_v(t)$ unless nodes
$u$ and $v$ have the same set of ancestors at some level $l$ above
them in the network.

Proof: Because $t > T_s$, we have $d_u = \dim(\Pi_u) < n$ and
$d_v = \dim(\Pi_v) < n$. Let us assume $\Pi_u(t) = \Pi_v(t) = \Pi$, so
we have $d = d_u = d_v$. From Corollary 5 we can write

$$\Pi_{P^l(u)}(t-l) = \Pi_{P^l(v)}(t-l) = \Pi,$$

for every $l \ge 1$. As $l$ increases, two cases may occur. First,
either $P^l(u)$ or $P^l(v)$ contains the source node $S$, which results
in $\dim(\Pi_{P^l(u)}(t-l)) = n$ or $\dim(\Pi_{P^l(v)}(t-l)) = n$; this
is a contradiction since $d < n$. Second, nodes $u$ and $v$ have
the same set of ancestors at some level $l$. ■

Up to here, we have shown that, assuming the dissemination
protocol is in the steady state, the subspaces of two arbitrary
nodes are equal only if they have the same ancestors at some
level above them in the network. The following result, Theorem 5,
states sufficient conditions under which the nodes' subspaces
are different for the dissemination Algorithm II.1.

Theorem 5: Suppose two arbitrary nodes $u$ and $v$ have the
same set of parents $P^l = P^l(u) = P^l(v)$ at some level $l$. The
following conditions are sufficient for the dissemination
Algorithm II.1 to satisfy condition (7):

$$c_u = \operatorname{min-cut}(P^l, u) < \operatorname{min-cut}(S, P^l) = c_p,$$
$$c_v = \operatorname{min-cut}(P^l, v) < \operatorname{min-cut}(S, P^l) = c_p.$$

Proof: Consider the set of nodes in $P^l$. From the definition
we know that there exists at least one path of length $l$
from each node in $P^l$ to the node $u$. But there might also exist
paths of length less than $l$ from some nodes in $P^l$ to $u$. If this
is the case, because the topology is a directed acyclic graph,
we can find a subset $P'$ of the nodes in $P^l$ such that it forms a
cut for the node $u$ and the shortest path from each node in $P'$
to $u$ has length $l$; see Figure 3. Moreover, we have $\operatorname{min-cut}(S, P') = c_p$
and $\operatorname{min-cut}(P', u) = c_u$.

6 Note that the min-cut to node $u$, $\operatorname{min-cut}(S, u)$, equals
$\min\{c_u, c_p\}$.

Now assume that $P' = \{p_1, \ldots, p_k\}$ such that $\tau_{p_1} < \cdots <
\tau_{p_k}$. Let $a_1, \ldots, a_k$ be the accumulative min-cuts from $S$ to
each node in $P'$. By this we mean that $a_1 = c_{p_1}$, and $a_2$ is
the amount of increase in the min-cut from $S$ by adding node
$p_2$, and so on. We similarly consider the accumulative min-cut
values from $p_i$ to $u$ and denote these by $b_1, \ldots, b_k$. So we
have $\sum_{j=1}^{k} a_j = c_p$ and $\sum_{j=1}^{k} b_j = c_u$.

From the definition of the waiting times (Definition 1), and writing
$\tau_i$ for $\tau_{p_i}$, we can write

$$\begin{aligned}
d_{P'}(\tau_1) &\ge a_1 + 1, \\
d_{P'}(\tau_2) &\ge d_{P'}(\tau_1) + (\tau_2 - \tau_1)\, a_1 + a_2, \\
&\ \ \vdots \\
d_{P'}(\tau_k) &\ge d_{P'}(\tau_{k-1}) + (\tau_k - \tau_{k-1}) \sum_{j=1}^{k-1} a_j + a_k.
\end{aligned}$$

Then we have

$$d_{P^l}(\tau_k) \ge d_{P'}(\tau_k) \ge (\tau_2 - \tau_1)\, a_1 + \cdots + (\tau_k - \tau_{k-1}) \sum_{j=1}^{k-1} a_j + \sum_{j=1}^{k} a_j + 1. \qquad (8)$$

For $d_u$ we can also write

$$\begin{aligned}
d_u(\tau_1 + l) &\le b_1, \\
d_u(\tau_2 + l) &\le d_u(\tau_1 + l) + (\tau_2 - \tau_1) \min[a_1, b_1] + b_2, \\
&\ \ \vdots \\
d_u(\tau_k + l) &\le d_u(\tau_{k-1} + l) + (\tau_k - \tau_{k-1}) \min\Big[\sum_{j=1}^{k-1} a_j, \sum_{j=1}^{k-1} b_j\Big] + b_k,
\end{aligned}$$

or

$$d_u(\tau_k + l) \le (\tau_2 - \tau_1) \min[a_1, b_1] + \cdots + (\tau_k - \tau_{k-1}) \min\Big[\sum_{j=1}^{k-1} a_j, \sum_{j=1}^{k-1} b_j\Big] + \sum_{j=1}^{k} b_j. \qquad (9)$$

From (8), (9) and the theorem assumptions we conclude that
$d_u(\tau_k + l) < d_{P^l}(\tau_k)$. Now for $\Delta t$ time slots later we write

$$\begin{aligned}
d_u(\tau_k + l + \Delta t) &\overset{(a)}{\le} d_u(\tau_k + l) + c_u \Delta t \\
&\overset{(b)}{<} d_{P^l}(\tau_k) + c_p \Delta t \\
&\overset{(c)}{=} d_{P^l}(\tau_k + \Delta t),
\end{aligned}$$

where (a) is true because $u$ receives packets from $P^l$ at
rate at most $c_u$; (b) is true because $d_u(\tau_k + l) < d_{P^l}(\tau_k)$
and $c_u < c_p$; and finally (c) is true because after $\tau_k$ all of
the nodes in $P'$ receive packets at a rate equal to their min-cut,
which means that $P'$ (the same is true for $P^l$) receives packets
at a rate equal to its min-cut $c_p$.

The same inequality holds for the dimension of
$\Pi_v(\tau_k + l + \Delta t)$. Thus for time $t > \tau_k + l$ we cannot
have $\Pi_{P^l}(t - l) = \Pi_u(t)$ and $\Pi_{P^l}(t - l) = \Pi_v(t)$ if $c_u < c_p$
and $c_v < c_p$. So using Corollary 5 we are done. ■



Fig. 3. Sets used in the proof of Theorem 5: the set $P(u)$ contains the
parents of node $u$ at distance $l = 1$; the set $P^2(u)$ contains the set of parents
at distance $l = 2$; while $P'$ is the subset of $P^2(u)$ at distance no less than
$l = 2$.



Intuitively, what Theorem 5 tells us is that, if for a node $u$
there exists a path that does not belong to any cut between the
source and another node $v$, then nodes $u$ and $v$ will definitely
have distinct subspaces. The only case where nodes $u$ and
$v$ may have the same subspace is if they have a common
set of parents, i.e., a common cut. Even then, both of them
would need to receive all the innovative information that
flows through the common cut at the same time. Note that the
conditions of Theorem 5 are also necessary for identifiability in
the special case of tree topologies, such as the topology in
Figure 2.

C. Practical Considerations 

We here argue that our proposed scheme can lead to a
practical protocol, where nodes passively collect information
during the dissemination and send, once, a small amount of
information to the central node in charge of the topology
inference. In particular, we assume that the nodes follow
the information dissemination protocol and at some point the
central node queries them to report the subspaces they gathered
at a specific time $t$.

We now calculate the communication cost (total number
of bits that must be transmitted to a central node) of
the proposed passive inference algorithm. Each node has to
transmit at most $2\Delta_i(G)$ subspaces to the central node, where
$\Delta_i(G)$ is the maximum in-degree of the nodes in the network.
There are $\vartheta$ nodes in the network, so at most $2\vartheta\Delta_i(G)$ subspaces have
to be transmitted. The total number of subspaces of $\Pi_S$ (which
itself is an $n$-dimensional space) is

$$\sum_{i=1}^{n} \begin{bmatrix} n \\ i \end{bmatrix}_q \approx \sum_{i=1}^{n} q^{i(n-i)} \approx q^{n^2/4},$$

where $\begin{bmatrix} n \\ i \end{bmatrix}_q$ is the Gaussian number, the number of
$i$-dimensional subspaces of an $n$-dimensional space. To approximate
the Gaussian number we use [32, Lemma 1]; note that
the approximation holds for large $q$.
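The quality of this approximation can be checked with exact integer arithmetic. The sketch below uses the standard product formula for the Gaussian binomial coefficient; the function names `gaussian_binomial` and `total_subspaces` are illustrative, and the parameters $n = 8$, $q = 2^8$ are assumed values chosen only for the comparison.

```python
def gaussian_binomial(n, i, q):
    """Number of i-dimensional subspaces of an n-dimensional space over GF(q)."""
    num = den = 1
    for j in range(i):
        num *= q ** (n - j) - 1
        den *= q ** (i - j) - 1
    return num // den  # the quotient is always an exact integer


def total_subspaces(n, q):
    """Sum of the Gaussian numbers for dimensions 1..n."""
    return sum(gaussian_binomial(n, i, q) for i in range(1, n + 1))


n, q = 8, 2**8
exact_bits = total_subspaces(n, q).bit_length()      # bits to index any subspace
approx_bits = (n * n // 4) * (q.bit_length() - 1)    # (n^2/4) * log2(q), q a power of 2
print(exact_bits, approx_bits)
```

For large $q$ the two bit counts differ by at most a few bits, which is what the approximation $q^{n^2/4}$ predicts.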

7 We assume the query is sent before time $t$ actually occurs. Also note that
if the number of source packets $n$ is much larger than the min-cut to each
node, and if we have an estimate for $\Delta_i(G)$, a central node can with high
probability select a time $t$ in steady state. A node can also send a feedback
message to inform the central node if it is not in steady state at time $t$.
So to encode one of the subspaces of $\Pi_S$ we need approximately
$\frac{n^2}{4} \log_2 q$ bits. As a result, the total number of bits
that need to be transmitted to the central node is at most
$2\vartheta\Delta_i(G) \cdot \frac{n^2}{4} \log_2 q$ bits.



Clearly, the complexity depends on $n$, the
number of packets that the source transmits. In our work we
assume that $n$ is large enough that the network enters the
steady state; on the other hand, other considerations, such as
decoding complexity at network nodes, would require $n$ to take
moderate values. Note that, for our algorithm to work (i.e., to
sample the network while in the steady state), we only require
that $n = 2\beta c_{\max} D(G)$ (Corollary 3), where $\beta > 1$ is some
constant that determines how many time slots the network is
in the steady state. If $n$ has such a size, the maximum number
of bits that need to be transmitted per node (communication
cost per node) is

$$R_{\text{com-cost/ND}} \approx 2\beta^2 c_{\max}^2 D(G)^2 \Delta_i(G) \log_2 q \ \text{bits}.$$

In the above equation $\beta$, $c_{\max}$, and $\Delta_i(G)$ are constants.
The only parameter that depends on the network size is
$D(G)$. However, for most practical content distribution
networks the longest path of the network is kept small to ensure
good connectivity between nodes in the network (see for
example [34]).

To give a specific example of a possible communication
cost, let us consider a practical scenario where $q = 2^8$,
$c_{\max} = 1$, $\beta^2 = 5$, $\Delta_i(G) = 5$, and $D(G) = 10$. Then
we have $R_{\text{com-cost/ND}} \approx 4$ kilobytes. In contrast, in a practical
dissemination scenario (e.g., of video) we would disseminate a
large number of information packets, each possibly as large
as a few megabytes; thus the overhead of the topological
information would not be significant.
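The per-node cost expression can be evaluated directly. This is a minimal sketch assuming the per-node formula $2\beta^2 c_{\max}^2 D(G)^2 \Delta_i(G) \log_2 q$; the function name `per_node_cost_bits` and its parameter names are illustrative.

```python
import math


def per_node_cost_bits(beta_sq, c_max, depth, max_in_degree, q):
    """Per-node communication cost in bits:
    2 * beta^2 * c_max^2 * D(G)^2 * Delta_i(G) * log2(q)."""
    return 2 * beta_sq * c_max**2 * depth**2 * max_in_degree * math.log2(q)


bits = per_node_cost_bits(beta_sq=5, c_max=1, depth=10, max_in_degree=5, q=2**8)
print(bits, bits / 8)
# 40000.0 5000.0
```

With the parameters of the example this evaluates to 40,000 bits, i.e., about 5,000 bytes, the same order of magnitude as the few kilobytes quoted in the text and negligible next to payloads of several megabytes.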

V. Locating Byzantine Attackers 

In this section we explore a problem that is dual to topology 
inference: given complete knowledge of the topology, we 
leverage subspace properties to identify the location of a 
malicious Byzantine attacker. 

In a network coded system, adversarial nodes in the
network disrupt the normal operation of the information flow
by inserting erroneous packets into the network. This can be
done by inserting spurious data packets on their outgoing
edges. One way in which these erroneous packets can be
prevented from disrupting the information flow is by reducing the
transmission rate to below the min-cut of the network and
using the redundancy to protect against errors [20], [21], [22].
One such technique, which uses subspaces to code information, was
proposed in [11]. In this approach, the source sends a basis
of the subspace corresponding to the message. In the absence
of errors, the linear operations of the intermediate nodes do
not alter the sent subspace, and hence the receiver decodes the
message by collecting a basis of the transmitted subspace.
A malicious attacker inserts vectors that do not belong to
the transmitted subspace. Therefore, if the message codebook
uses subspaces that are "far enough" apart (according to an
appropriately defined distance measure), then one can correct
these errors [11]. Note that this technique does not need
any knowledge of the network topology for the error correction
mechanism. All that is needed is that the intermediate nodes
do not alter the transmitted subspace (which holds if
they perform linear operations).

The approach of this section to locating adversaries uses the 
framework developed in the previous sections, where it was 
shown that under randomized network coding, the subspaces 
gathered by the nodes of the network provide information 
about the topology. Therefore, the basic premise in this section 
is to use the structure of the erroneous subspace inserted by 
the adversary to reveal information about its location, when 
we already know the network topology. 

A. Problem Formulation 

Consider a network represented as a directed acyclic graph
$G = (V, E)$. We have a source sending information to $r$
receivers, and one (or more) Byzantine adversaries located
at intermediate nodes of the network. We assume complete
knowledge of the network topology, and consider the source
and the receivers to be trustworthy (authenticated) nodes that
are guaranteed not to be adversaries.

Suppose the source $S$ sends $n$ vectors that span an $n$-dimensional
subspace $\Pi_S$ of the ambient vector space over $\mathbb{F}_q$, where we assume
$q \gg 1$. In particular, in this section we will consider (without
loss of generality) subspace coding, where $\Pi_S$ belongs to a
codebook $\mathcal{C}$, $\Pi_S \in \mathcal{C}$, designed to correct network errors and
erasures [11].

In the absence of any adversaries in the network, each
receiver $R_i$, $i = 1, \ldots, r$, can decode the exact space $\Pi_S$. Now
assume that there is an adversary, Eve, who attacks one of the
nodes in the network by combining a $\delta$-dimensional subspace
$\Pi_e$ with its incoming space and sending the resulting vectors
to its children. Then the receiver $R_i$ collects some linearly
independent vectors that span a subspace $\Pi_{R_i}$. We can write

$$\Pi_{R_i} = \mathcal{H}_i(\Pi_S + \Pi_e),$$

where $\mathcal{H}_i(\Pi)$ is a linear operator. This operator models the
linear transformation that the network induces on the inserted
source and adversary packets.

We assume that the receiver is able to at least detect that a
Byzantine attack is under way. Moreover, we assume that the
receiver is able to decode the subspace $\Pi_S$ that the source has
sent. This might be either because the receiver has correctly
decoded the sent message (i.e., using the code construction from
[11]), or because, after detecting the presence of an attack, it has
requested the source subspace from the source node through a
secure channel.

We can restrict the Byzantine attack in several ways, depending
on the edges where the attack is launched, the number
of corrupted vectors inserted, and the vertices (network nodes)
that the adversary has access to. In this section we will
distinguish between the cases where

I. there is a single Byzantine attacker located at a vertex
of the network, and
II. there are multiple independent attackers, located at
different vertices, that act without coordinating with each
other.

We assume that each attacker located at a single vertex is able
to corrupt any of its outgoing edges by inserting arbitrary erroneous
information. However, in this work we only consider the case
where the attackers inject independent information without any
coordination among themselves.

We are interested in understanding under what conditions 
we can uniquely identify the attacker's location (or, up to 
what uncertainty we can identify the attacker), under the above 
scenarios. 

B. The Case of a Single Adversary 

In this section we focus on the case where we want to locate 
a Byzantine adversary, Eve, controlling a single vertex of the 
network graph. 

In §V-B1 we illustrate the limitation of using only the
information the receivers have observed, along with the knowledge
of the topology, to locate the adversary. This motivates
requiring additional information from the intermediate nodes,
related to the subspaces they observe. In §V-B2 we
show that such additional information allows us to localize
the adversary either uniquely or within an ambiguity of at
most two nodes.

1) Identification using only Topological Information: To
illustrate the ideas, we will examine the case where the
corrupted packets are inserted on a single edge of the network,
say edge $e_A$. The extension to the cases where multiple edges
are corrupted is easy.

Since each receiver $R$ knows the subspaces $\{\Pi_R^{(i)}\}$ it has
received from its $|\mathrm{In}(R)|$ parents, it knows whether what it
received is corrupted or not (a subspace of $\Pi_S$ or not). Using
this, we can infer some information regarding topological
properties that the edge $e_A$ should satisfy. In particular we
have the following result, Lemma 8.

Lemma 8: Let $P_e$ denote the set of paths starting from
the source and ending at edge $e$. Then, if $\mathcal{E}_C$ is the set of
incoming edges to receivers that bring corrupted packets, while
$\mathcal{E}_S$ is the set of incoming edges to receivers that only bring
source information, the edge $e_A$ belongs to the set of edges
$\mathcal{E}_A$, with

$$\mathcal{E}_A = \Big( \bigcap_{e \in \mathcal{E}_C} P_e \Big) \setminus \Big( \bigcup_{e \in \mathcal{E}_S} P_e \Big).$$

Proof: If R receives corrupted vectors from an incoming 
edge $e$, then there exists at least one path that connects $e_A$ to 
$e$. Thus $e_A$ is part of at least one path in $P_e$. 

Conversely, if a receiver R does not receive corrupted 
packets from an incoming edge $e$, then $e_A$ does not form part 
of any path in $P_e$. That is, there does not exist a path that 
connects $e_A$ to $e$. ■ 

The following example illustrates this approach. 

Example 3: Consider the network in Figure 4 and assume 
that $R_1$ receives corrupted packets from edge $DR_1$ and uncorrupted 
packets from $AR_1$, while $R_2$ receives only uncorrupted 
packets. Then $\mathcal{E}_A = \{DR_1\}$ and the attacker is located on 
node D. ■ 
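The set computation in Lemma 8 can be sketched in a few lines of Python. This is only an illustration: the edge list below is a hypothetical topology assembled from the edge names that appear in the examples (the actual network of Figure 4 may differ), and the helper names `ancestor_edges` and `candidate_edges` are introduced here. It also assumes that every node is reachable from the source, so that the ancestor edges of $e$ are exactly the edges lying on some source-to-$e$ path.

```python
def ancestor_edges(edges, target):
    """P_e: `target` itself plus every edge from which `target` can be reached."""
    incoming = {}
    for u, v in edges:
        incoming.setdefault(v, []).append((u, v))
    seen, stack = {target}, [target]
    while stack:
        tail, _head = stack.pop()
        for e in incoming.get(tail, []):
            if e not in seen:
                seen.add(e)
                stack.append(e)
    return seen


def candidate_edges(edges, corrupted, clean):
    """Intersection of P_e over corrupted receiver edges, minus the union over clean ones."""
    result = set(edges)
    for e in corrupted:
        result &= ancestor_edges(edges, e)
    for e in clean:
        result -= ancestor_edges(edges, e)
    return result


# Hypothetical topology (an assumption, not taken from the figure itself).
EDGES = [("S", "B"), ("B", "A"), ("B", "C"), ("A", "D"), ("C", "D"),
         ("A", "R1"), ("D", "R1"), ("D", "R2"), ("C", "R2")]

# Observations of Example 3: only edge D->R1 delivers corrupted packets.
corrupted = [("D", "R1")]
clean = [("A", "R1"), ("D", "R2"), ("C", "R2")]

print(candidate_edges(EDGES, corrupted, clean))
```

On this assumed topology the only surviving candidate is the edge from D to $R_1$, in agreement with Example 3.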

8 In the following we will equivalently think of $P_e$ as the set of all 
edges that take part in these paths. 




Fig. 4. The source S distributes packets to receivers $R_1$ and $R_2$. 

In Example 3 we were able to exactly identify the location 
of the adversary, because the set $\mathcal{E}_A$ contained a single 
edge, and node $R_1$ is trustworthy. It is easy to find network 
configurations where $\mathcal{E}_A$ contains multiple edges, or in fact 
all the network edges, and thus we can no longer identify the 
attacker. The following example illustrates one such case. 

Example 4: Consider the line network shown in Figure 5. 
Suppose the attacker is node A. If the receiver R sees a 
corrupted packet, then using just the topology, the attacker 
could equally well be any of the other nodes in the line network. This 
illustrates that the topology and receiver information alone can 
leave a large ambiguity in the location of the attacker. ■ 

Therefore, Example 4 motivates the ideas examined in 
§V-B2, which obtain additional information and exploit the 
structural properties of the observed subspaces. 

2) Identification using Information from all Network Nodes: 
We will next discuss algorithms where a central authority, 
which we will call controller, requests from all nodes in the 
network to report some additional information, related to the 
subspaces they have received from their parents. The adversary 
could send inaccurate information to the controller, but the 
other nodes report the information accurately. Our task is to 
design the question to the nodes such that we can locate the 
adversary, despite its possible misdirection. 

The controller may ask the nodes for the following types of 
information, listed in decreasing order of complexity: 

Information 1: Each node $v$ sends all subspaces $\Pi_v^{(i)}$ it 
has received from its parents, where $\Pi_v = \sum_{i \in P(v)} \Pi_v^{(i)}$. 

Information 2: Each node $v$ sends a randomly chosen 
vector from each of the received subspaces ($|\mathrm{In}(v)|$ 
vectors in total). 
Information 2 is motivated by the following well-known 
observation (see Lemma 2): let $\Pi_1$ and $\Pi_2$ be two subspaces 
of $\mathbb{F}_q^n$, and assume that we randomly select a vector $y$ from 
$\Pi_1$. Then, for $q \gg 1$, $y \in \Pi_2$ if and only if $\Pi_1 \subseteq \Pi_2$, 
with high probability. Thus, 
a randomly selected vector from $\Pi_u$ allows us to check whether 
$\Pi_u \subseteq \Pi_S$ or not. 
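A minimal sketch of this membership test follows. It is an illustration under stated assumptions rather than the procedure used in the paper: the field is taken to be GF(Q) for a large prime `Q` chosen here, subspaces are given as lists of basis vectors, and the function names are hypothetical.

```python
import random

Q = 2**31 - 1  # a large prime standing in for the field size q (assumption)


def rank_mod_q(rows, q=Q):
    """Rank of a list of row vectors over GF(q), by Gaussian elimination."""
    basis = []  # reduced rows, each scaled so that its leading entry is 1
    for row in rows:
        row = [x % q for x in row]
        for b in basis:
            lead = next(i for i, x in enumerate(b) if x)
            if row[lead]:
                factor = row[lead]
                row = [(x - factor * y) % q for x, y in zip(row, b)]
        if any(row):
            lead = next(i for i, x in enumerate(row) if x)
            inv = pow(row[lead], q - 2, q)  # Fermat inverse, valid because q is prime
            basis.append([(x * inv) % q for x in row])
    return len(basis)


def random_vector(basis, rng, q=Q):
    """Uniformly random element of the span of `basis` (assumed independent)."""
    coeffs = [rng.randrange(q) for _ in basis]
    width = len(basis[0])
    return [sum(c * b[j] for c, b in zip(coeffs, basis)) % q for j in range(width)]


def looks_contained(pi_1, pi_2, rng, q=Q):
    """Draw one random y from pi_1 and report whether y lies in pi_2."""
    y = random_vector(pi_1, rng, q)
    return rank_mod_q(list(pi_2) + [y], q) == rank_mod_q(pi_2, q)


# Source space spanned by e1, e2, e3 inside a 4-dimensional ambient space.
PI_S = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
clean = [[1, 1, 0, 0], [0, 0, 1, 0]]      # contained in PI_S
corrupted = [[1, 0, 0, 0], [0, 0, 0, 1]]  # has a component outside PI_S

rng = random.Random(0)
print(looks_contained(clean, PI_S, rng))
print(looks_contained(corrupted, PI_S, rng))
```

For the clean subspace the test always succeeds. For the corrupted one it fails unless the random coefficient of the outside direction happens to be zero, which occurs with probability 1/Q.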

In fact, we will show in this section that for a single 
adversary it is sufficient to use$^9$ Information 2, and classify 
the edges of the network by simply testing whether the 
information flowing through each edge is a subspace of $\Pi_S$ 
or not (i.e., is corrupted or not). 

Theorem 6: Using Information 1, by splitting the network 
edges into corrupted and uncorrupted sets, we can narrow the 
location of the adversary down to a set of at most two nodes. 
With Information 2, the same result holds w.h.p. 

9 Using Information 2, these statements hold with high probability, i.e., 
the probability goes to one as the field size $q \to \infty$. 

Fig. 5. The source S sends information to receiver R over a line network. 

Proof: The network is a directed acyclic graph, so we 
can impose a partial order on the edges of the graph, such that 
$e_1 > e_2$ if $e_1$ is an ancestor edge of $e_2$ (i.e., there exists a path 
from $e_1$ to $e_2$). Then, having Information 1 or Information 2, 
we can divide the edges of the network into two sets: the set 
of edges $E_C$ through which corrupted subspaces are reported 
to flow, and the remaining edges $E_S$ through which the 
source information flows, so that $E = E_S \cup E_C$ and 
$E_S \cap E_C = \emptyset$. Note that all the outgoing edges from the source 
belong to $E_S$. 

Nodes in the network perform randomized network coding, 
so every node that receives corrupted information on at least 
one of its incoming edges pollutes all of its outgoing edges 
w.h.p. Let $t_v$ be the number of corrupted outgoing 
edges of a node $v$, where $0 \le t_v \le |\mathrm{Out}(v)|$. For 
each node $v$ that is not an adversary we have either $t_v = 0$ or 
$t_v = |\mathrm{Out}(v)|$. 

Now, to prove the theorem we consider the following 
possible cases. 

1) If the adversary Eve corrupts $t_A$ outgoing edges, where 
$1 < t_A < |\mathrm{Out}(A)|$, we can identify the node she has 
attacked uniquely because its behavior is different from 
that of all other nodes. 

2) If she corrupts all of the node's outgoing edges, $t_A = |\mathrm{Out}(A)|$, 
then she can try to mislead us by declaring that one of the node's 
incoming edges is corrupted. If A declares more than 
one of its incoming edges as corrupted, we can find its 
location uniquely. 

3) She can also corrupt only one of the node's outgoing edges, 
$t_A = 1$, and pretend that its child is in fact the 
adversary by declaring that all of its incoming edges bring 
non-corrupted information. She cannot declare that any 
of its incoming edges is polluted, since then we could 
find its location uniquely. 

In all of the above cases the adversary is on the boundary 
of the two sets $E_S$ and $E_C$, and the ambiguity about its location 
is at most within a set of two vertices: the two vertices 
connected by the corrupted edge 
of highest order among all corrupted edges (recall that we 
can compare the corrupted edges using the imposed 
partial order). ■ 
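The localization step in this proof can be sketched as follows. The code is a hedged illustration for the single-adversary case only: it takes the reported corrupted set as input, keeps the corrupted edges that have no corrupted ancestor edge, and returns the nodes common to all of them. The function names and the line network used in the example are assumptions made for this sketch.

```python
def strict_ancestors(edges, target):
    """All edges from which `target` can be reached (target itself excluded)."""
    incoming = {}
    for u, v in edges:
        incoming.setdefault(v, []).append((u, v))
    seen, stack = set(), [target]
    while stack:
        tail, _head = stack.pop()
        for e in incoming.get(tail, []):
            if e not in seen:
                seen.add(e)
                stack.append(e)
    return seen


def locate_single_adversary(edges, corrupted):
    """Nodes shared by every highest-order corrupted edge (at most two nodes)."""
    corrupted = set(corrupted)
    top = [e for e in corrupted if not (strict_ancestors(edges, e) & corrupted)]
    if not top:
        return set()
    suspects = set(top[0])
    for e in top[1:]:
        suspects &= set(e)
    return suspects


# Line network S -> A -> B -> C -> R with the adversary on node B (assumed example).
LINE = [("S", "A"), ("A", "B"), ("B", "C"), ("C", "R")]

# B corrupts its outgoing edge and reports its incoming edge as clean.
print(locate_single_adversary(LINE, [("B", "C"), ("C", "R")]))
# B additionally declares its incoming edge as corrupted.
print(locate_single_adversary(LINE, [("A", "B"), ("B", "C"), ("C", "R")]))
```

In both runs the returned set has two nodes and contains B: the ambiguity is with the child C in the first case and with the parent A in the second.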

C. The Case of Multiple Adversaries 

In the case of a single adversary, it was sufficient to divide 
the set of edges into two sets, $E_S$ and $E_C$, as described in the 
previous section. In the presence of multiple adversaries, this 
may no longer be sufficient. An additional complication is that, 
realistically, we may not know the exact number of adversaries 
present. In the following, we discuss a number of algorithms 
that offer weaker or stronger identifiability guarantees. 



1) Identification using only Topological Information: The 
approach in §V-B1 can be directly extended to the case 
of multiple adversaries, but again offers no identifiability 
guarantees. 

Example 5: Consider again the network in Figure 4 and 
assume that $R_1$ receives corrupted packets only from edge 
$DR_1$ while $R_2$ receives corrupted packets only from edge 
$DR_2$. Then $\mathcal{E}_A = \{AD, CD, DR_1, DR_2\}$ and (depending on 
our assumptions) we may have 

- a single adversary located on node D, 

- two adversaries, located on nodes A and C, 

- two adversaries, located on nodes A and D, or nodes C 
and D, or 

- three adversaries, located on nodes A, C, and D. 

■ 

2) Identification using Splitting: Similarly to §V-B2, using 
Information 1 or Information 2, we can divide the set of 
edges into two sets $E_S$ and $E_C$, depending on whether the 
information flowing through each edge belongs to $\Pi_S$ or 
not. Depending on the network topology, we may be able to 
uniquely identify the location of the attackers. However, this 
approach, although it is guaranteed to find at least one of the 
attackers (within an uncertainty of at most two nodes), does 
not necessarily find all the attackers, even if we know their 
exact number. 

To show this let us state the following definition. 

Definition 4: We say that node v is in the shadow of node 
A, if there exists a path that connects every incoming edge of 
v to a corrupted outgoing edge of A. 
Then we have the following result. 

Lemma 9: By splitting the network edges into two sets $E_S$ 
and $E_C$ we cannot identify adversarial nodes that are in the 
shadow of an adversary A. 

Proof: This is because if an attacker is in the shadow of 
another attacker, it may corrupt only already corrupted vectors 
and thus not incur a detectable effect. So we cannot distinguish 
between an attacker and a normal node that are in the shadow 
of A. ■ 

The following example illustrates these points. 

Example 6: For the example in Figure 4, assume that each 
attacker corrupts all its outgoing edges, and consider the 
following two situations: 

1) Assume that nodes A and C are attackers. If A 
reports truthfully while C lies, we get $E_C = 
\{AD, AR_1, DR_1, DR_2, BC, CR_2, CD\}$, which allows us 
to identify the attackers. 

2) Assume that nodes B and D are attackers. Then we say 
that node D is in the shadow of node B, as it corrupts 
only packets already corrupted by B. Indeed, if $E_C = 
\{SB, BA, BC, AD, AR_1, DR_1, DR_2, BC, CR_2, CD\}$, 
knowing that the source is trustworthy, we can infer that 
node B is an attacker. However, any of the nodes A, 
C, and D is equally likely to be the second attacker. 
All these nodes are in the shadow of node B. 

■ 
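Definition 4 translates directly into a reachability check. The sketch below is illustrative only: the edge list is a hypothetical topology built from the edge names used in the examples (the real Figure 4 may differ), and `reachable_edges` and `shadow_of` are names introduced here.

```python
def reachable_edges(edges, start_edges):
    """The start edges plus every edge that can be reached from one of them."""
    outgoing = {}
    for u, v in edges:
        outgoing.setdefault(u, []).append((u, v))
    seen, stack = set(start_edges), list(start_edges)
    while stack:
        _tail, head = stack.pop()
        for e in outgoing.get(head, []):
            if e not in seen:
                seen.add(e)
                stack.append(e)
    return seen


def shadow_of(edges, corrupted_out):
    """Nodes whose incoming edges all lie downstream of the given corrupted edges."""
    reach = reachable_edges(edges, corrupted_out)
    incoming = {}
    for u, v in edges:
        incoming.setdefault(v, []).append((u, v))
    return {v for v, inc in incoming.items() if all(e in reach for e in inc)}


# Hypothetical topology (an assumption, not taken from the figure itself).
EDGES = [("S", "B"), ("B", "A"), ("B", "C"), ("A", "D"), ("C", "D"),
         ("A", "R1"), ("D", "R1"), ("D", "R2"), ("C", "R2")]

# Attacker B corrupts both of its outgoing edges.
print(sorted(shadow_of(EDGES, [("B", "A"), ("B", "C")])))
```

On this assumed topology every node downstream of B, including A, C and D, is in the shadow of B, which is why a second attacker placed on any of them leaves the corrupted edge set unchanged.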

Theorem 7: Using Information 1 and the splitting method, it is 
possible to narrow down the location of those adversaries that 
have the highest order in the network. The same 
result holds for Information 2 w.h.p. 

Proof: As stated in the proof of Theorem 6, we can impose 
a partial order on the edges of the network graph. Then, by 
using Information 1 or Information 2 we may split the network 
edges into two sets $E_S$ and $E_C$. 

Because every node in the network performs randomized 
network coding, there are only two possibilities for each 
adversary to corrupt its outgoing edges and report subspaces 
for its incoming edges such that it is not located uniquely. 
These are as follows. 

1) She corrupts some (or all) of the node's outgoing edges but 
reports its incoming edges as uncorrupted. 

2) She corrupts all of the node's outgoing edges and reports some 
(at least one) of its incoming edges as corrupted. 

Now, let us consider the set of all the corrupted edges that 
have highest order with respect to other corrupted edges and 
cannot be compared against each other. For each of the above 
cases there should be at least one adversary connected to every 
edge in this set. ■ 

3) Identification using Subset Relationships: In this subsec- 
tion we develop a new algorithm to find the adversaries which 
is based on Information 1. 

For each node $u \in V$, let $P(u) = \{u_1, \ldots, u_{p_u}\}$ denote 
the set of parent nodes of $u$. We are going to treat $P(u)$ as a 
super node, and use the notation $\Pi_{P(u)} \triangleq \sum_{i=1}^{p_u} \Pi_{u_i}$ for the 
union (that is, the sum) of the subspaces of all nodes in $P(u)$. Also recall that 
$\Pi_v^{(u)}$ denotes the subspace received by node $v$ from node $u$. 
Our last algorithm checks, for every node $u \in V$, whether 

$$\Pi_v^{(u)} \subseteq \Pi_{P(u)} \quad \forall v \in V : e_{uv} \in E. \qquad (10)$$

Then we have the following result, Theorem 8. 

Theorem 8: If the pairwise distance between adversaries is 
greater than two, it is possible to find the exact number as 
well as the location of the attackers (within an uncertainty of 
parent-children sets) using the subset method. 

Proof: First, let us focus on the single-adversary case, where 
$A \in V$ is the node attacked by the adversary. Then we will 
generalize the idea to an arbitrary number of adversaries. 

If (10) is satisfied for all children of $u$, we know that node 
$u$ is not an adversary. If the relationship is not satisfied, that is, 
$\Pi_v^{(u)} \nsubseteq \Pi_{P(u)}$ for at least one child $v$ of $u$, we consider node 
$u$ as a potential candidate for being an adversary. We know for 
sure that 

$$\Pi_v^{(A)} \nsubseteq \Pi_{P(A)} \quad \forall v \in V : e_{Av} \in E,$$

but depending on the subspace that the adversary reports, 
relation (10) may also fail for other nodes. Based 
on what the adversary reports, there are two possible 
cases. 

If the adversary pretends that it is a trustworthy node (just 
declares the received subspace from its parents), the above 
relation also fails for the children of A who receive corrupted 
subspaces. On the other hand, if the adversary tells the truth 
and declares its corrupted subspace, we have 

$$\Pi_A^{(u)} \nsubseteq \Pi_{P(u)} \quad \forall u \in V : e_{uA} \in E.$$



Thus the ambiguity set we have identified includes the ad- 
versary and its parents and/or its children depending on the 
adversary's report. 

Repeating this procedure for every node in the network, 
we can identify sets of potential adversaries. Depending 
on the adversaries' actions, there is some ambiguity 
in finding their exact location. In the worst case, the 
uncertainty is within a set of nodes including the adversary, its 
parents and its children. So if the distance between adversaries 
is greater than two, the "uncertainty" sets do not overlap. In 
this case we can easily distinguish between different 
adversaries. ■ 

This procedure allows us to identify adversaries (within the 
mentioned parent-children ambiguity set), even if one is in the 
shadow of another, and even if we do not know their exact 
number, provided they are "far enough" apart in the network to be 
distinguishable. 
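The subset check (10) can be sketched as below. This is an illustration under explicit assumptions, not the authors' implementation: arithmetic is over GF(Q) for a prime `Q` chosen here, each node's subspace is taken to be the sum of the incoming subspaces it reports, the source is trusted, and the line network and all names are hypothetical.

```python
Q = 2**31 - 1  # prime field size (assumption)


def rank_mod_q(rows, q=Q):
    """Rank of a list of row vectors over GF(q), by Gaussian elimination."""
    basis = []
    for row in rows:
        row = [x % q for x in row]
        for b in basis:
            lead = next(i for i, x in enumerate(b) if x)
            if row[lead]:
                factor = row[lead]
                row = [(x - factor * y) % q for x, y in zip(row, b)]
        if any(row):
            lead = next(i for i, x in enumerate(row) if x)
            inv = pow(row[lead], q - 2, q)
            basis.append([(x * inv) % q for x in row])
    return len(basis)


def is_subspace(a, b, q=Q):
    """True if span(a) is contained in span(b)."""
    return rank_mod_q(list(b) + list(a), q) == rank_mod_q(b, q)


def subset_suspects(edges, source, source_space, reports):
    """Nodes u for which some reported Pi_v^(u) is not inside Pi_P(u).

    reports[(u, v)] is the subspace node v says it received from node u.
    """
    parents = {}
    for u, v in edges:
        parents.setdefault(v, []).append(u)

    def node_space(w):
        if w == source:
            return list(source_space)
        return [vec for x in parents[w] for vec in reports[(x, w)]]

    flagged = set()
    for u in parents:  # every node except the (trusted) source
        parent_sum = [vec for w in parents[u] for vec in node_space(w)]
        for (tail, head) in edges:
            if tail == u and not is_subspace(reports[(tail, head)], parent_sum):
                flagged.add(u)
    return flagged


LINE = [("S", "P"), ("P", "A"), ("A", "B"), ("B", "R")]
CLEAN = [[1, 0, 0], [0, 1, 0]]           # the source subspace
BAD = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # contains an injected direction

# Adversary A pretends to be honest about what it received.
pretend = {("S", "P"): CLEAN, ("P", "A"): CLEAN, ("A", "B"): BAD, ("B", "R"): BAD}
# Adversary A declares its corrupted subspace.
declare = {("S", "P"): CLEAN, ("P", "A"): BAD, ("A", "B"): BAD, ("B", "R"): BAD}

print(sorted(subset_suspects(LINE, "S", CLEAN, pretend)))
print(sorted(subset_suspects(LINE, "S", CLEAN, declare)))
```

In this toy run the suspects are the adversary and its child when it pretends to be honest, and the adversary and its parent when it declares the corrupted subspace, which mirrors the two cases in the proof of Theorem 8.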

VI. Practical Implications for Topology 
Management 

In §IV we demonstrated that, using the subspaces of all nodes, 
we can infer the network topology under certain conditions. In 
this section, we will show that even from what a single node 
observes, it is possible to get some information regarding the 
bottlenecks and clustering in the network. 

Leveraging this observation in the context of P2P networks, 
we propose algorithms that use this information in a distributed 
peer-initiated manner to avoid bottlenecks and clustering. 

A. Problem Statement and Motivation 

In peer-to-peer networks that employ network coding for 
content distribution (see for example Avalanche [3], [4]), 
we want to create and maintain a well-connected network 
topology, to allow the information to flow fast between the 
nodes; however, this is not straightforward. Peer-to-peer networks 
change very dynamically: hundreds of nodes 
may join and leave the network within seconds. All nodes in 
such a network are connected to a small number of neighbors 
(four to eight). An arriving node is allocated neighbors among 
the active participating nodes$^{10}$, which accept the solicited 
connection unless they have already reached their maximum 
number of neighbors. As a result, nodes that arrive at around 
the same time tend to get connected to each other, since they 
are all simultaneously available and looking for neighbors. 
That is, clusters and bottlenecks form in the 
network. 

To avoid this problem, one method adopted in protocols is to 
ask all nodes to periodically drop one neighbor and reconnect 
to a new one among an active peers list. This randomized 
rewiring results in a fixed average number of reconnections per 
node independently of how good or bad is the formed network 
topology. Thus to achieve a good, on the average, performance 
in terms of breaking clusters, it entails a much larger number 
of rewiring than required, and unnecessary topology changes. 

10 This is usually done by a central node which, following 
Avalanche, we call the "registrar". This is the central authority that keeps the list of all 
nodes in the network and gives every new node a set of neighbors. 
Fig. 6. The source S distributes packets to the peers A, B, C and D over 
the overlay network (a), that uses the underlying physical network (b). 

An alternative approach is to have peers initiate topology 
rewirings when they detect they are in a cluster. Clearly 
a central node could keep some structural information, i.e., 
keep track of the current network topology, and use it to 
make more educated choices of neighbor allocations. However, 
the information this central node can collect only reflects 
the overlay network topology, and is oblivious to bandwidth 
constraints from the underlying physical links. Acquiring 
bandwidth information for the underlying physical links at the 
central node requires costly estimation techniques over large 
and heterogeneous networks, and steers towards a centralized 
network operation. We will argue that such bottlenecks can 
be inferred almost passively in a peer-initiated manner, thus 
alleviating these drawbacks. 

Here, we will show that the coding vectors the peers receive 
from their neighbors can be used to passively infer bottleneck 
information. This allows individual nodes to initiate topology 
changes to correct problematic connections. In particular, peers 
by keeping track of the coding vectors they receive can detect 
problems in both the overlay topology and the underlying 
physical links. The following example illustrates these points. 

Example 7: Consider the toy network depicted in 
Figure 6(a), where the edges correspond to logical (overlay 
network) links. The source S has n packets to distribute to 
four peers. Nodes A, B and C are directly connected to the 
source S, and also among themselves with logical links, while 
node D is connected to nodes A, B and C. In this overlay 
network, there exist three edge-disjoint paths between the source 
and any other node. 

Assume now (as shown in Figure 6(b)) that the logical links 
SA, SB, SC share the bandwidth of the same underlying 
physical link, which forms a bottleneck between the source and 
the remaining nodes of the network. As a result, assume the 
bandwidth on each of these links is only 1/3 of the bandwidth 
of the remaining links. A central node (registrar), even if 
it keeps track of the complete logical network structure by 
querying each node about its neighbors, is oblivious 
to the existence of the bottleneck and the asymmetry between 
the link bandwidths. 

Node D, however, can infer this information by observing 
the coding vectors it receives from its neighbors A, B and 
C. Indeed, when node A receives a coded packet from the 
source, it will forward a linear combination of the packets it 
has already collected to nodes B, C and D. Each 
of the nodes B and C, once it receives the packet from 
node A, also attempts to send a coded packet to node D. 
But these packets will not bring new information to node D, 
because they will belong to the linear span of the coding vectors 
that node D has already received. Similarly, when nodes B 
and C receive a new packet from the source, node D will 
end up being offered three coded packets, one from each of 
its neighbors, and only one of the three will bring 
new information to node D. ■ 
More formally, the coding vectors nodes A, B and C will 
collect will effectively span the same subspace; thus the coded 
packets they will offer to node D to download will belong to 
significantly overlapping subspaces and will thus be redundant 
(we formalize these intuitive arguments in §VI-B). Node D can 
infer from this passively collected information that there is a 
bottleneck between nodes A, B, C and the source, and can 
thus initiate a connection change. 

B. Theoretical Framework 

Here we use the same notation introduced in §II. For 
simplicity we will assume that the network is synchronous$^{11}$. 
Nodes are allowed to transmit linear combinations of their 
received packets only at clock ticks, at a rate equal to the 
adjacent link bandwidth. 

Now we use the framework of §III to investigate the 
information that we can obtain from the local information of a 
node's subspace. From the notation defined in §II we know that 
for an arbitrary node $v$ we can write 

$$\Pi_v(t) = \sum_{i \in P(v)} \Pi_v^{(i)}(t).$$

We are interested in understanding what information we can 
infer from these received subspaces $\Pi_v^{(i)}$, $i \in P(v)$, about 
bottlenecks in the network. For example, the overlap of 
subspaces from the neighbors reveals some information about 
bottlenecks. We therefore need to show that such overlaps 
occur due to topological properties and not due to the particular 
random linear combinations chosen by the network code. 

Let us assume that the subspaces $\Pi_v^{(i)}$ that a node $v$ receives 
from its set of parents $P(v)$ have an intersection of dimension 
$d$. Then we have the following observations. 

Observation 1: The subspaces $\Pi_i$, $i \in P(v)$, of the neighbors 
have an intersection of dimension at least $d$ (see Corollary 1). 

Observation 2: The min-cut between the set of nodes $P(v)$ 
and the source is smaller than the min-cut between the node 
$v$ and the set $P(v)$ (see Theorem 2). 

In the following, we will discuss algorithms that use such 
observations for topology management. 

C. Algorithms 

Our peer-initiated algorithms for topology management consist 
of three tasks: 

1) Each peer decides whether it is satisfied with its connection 
or not, using a decision criterion. 

2) An unsatisfied peer sends a rewiring request, which can 
contain different levels of information, either directly to 
the registrar or to its neighbors (these are the only nodes 
the peer can communicate with). 

3) Finally, the registrar, having received rewiring requests, 
allocates neighbors to the nodes to be reconnected. 

11 This is not essential for the algorithms but simplifies the theoretical 
analysis. 

The decision criterion can capitalize on the fact that overlapping 
received subspaces indicate an opportunity for improvement. 
For example, in the first algorithm we propose 
(Algorithm 1), a node can decide it is not satisfied with a 
particular neighbor if it receives $k > 0$ non-innovative coding 
vectors from it, where $k$ is a parameter to be decided. Each 
unsatisfied node then directly contacts the registrar and 
specifies the neighbor it would like to change. The registrar 
randomly selects a new neighbor. This algorithm, as we 
demonstrate through simulation results, may lead to more 
rewirings than necessary: indeed, all nodes inside a cluster 
may attempt to change their neighbors, while it would have 
been sufficient for a fraction of them to do so. 
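A peer-side sketch of the Algorithm 1 criterion is given below. Several choices are assumptions made only to keep the example short: coding vectors are taken over GF(2) and stored as integer bitmasks, a neighbor is flagged once it has supplied at least `k` non-innovative vectors, and the class name `InnovationTracker` is hypothetical.

```python
class InnovationTracker:
    """Tracks the span of received coding vectors and per-neighbor redundancy."""

    def __init__(self, k):
        self.k = k                # threshold on non-innovative vectors
        self.basis = {}           # leading bit position -> reduced vector
        self.non_innovative = {}  # neighbor -> count of redundant vectors

    def receive(self, neighbor, vec):
        """Return True if `vec` increases the rank of the collected subspace."""
        while vec:
            lead = vec.bit_length() - 1
            if lead not in self.basis:
                self.basis[lead] = vec
                return True
            vec ^= self.basis[lead]  # eliminate the leading bit
        self.non_innovative[neighbor] = self.non_innovative.get(neighbor, 0) + 1
        return False

    def unsatisfied_with(self):
        """Neighbors that have sent at least k non-innovative vectors."""
        return {nb for nb, c in self.non_innovative.items() if c >= self.k}


# Node D of Example 7: A, B and C keep offering the same combination.
tracker = InnovationTracker(k=2)
for vec in (0b001, 0b010, 0b100):
    for neighbor in ("A", "B", "C"):
        tracker.receive(neighbor, vec)

print(len(tracker.basis), sorted(tracker.unsatisfied_with()))
```

Here only the first copy of each vector is innovative, so B and C each accumulate three redundant vectors and are flagged, while D still reaches rank three.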

Our second algorithm (Algorithm 2) uses a different 
decision criterion: for every two neighbors $u$ and $v$, each peer 
computes the rates at which the received joint space $\Pi_u + \Pi_v$ 
and the intersection space $\Pi_u \cap \Pi_v$ increase. If the ratio between 
these two rates becomes greater than a threshold $T$, the node 
decides it would like to change one of the two neighbors. 
However, instead of directly contacting the registrar, it uses a 
decentralized voting method that attempts to further reduce 
the number of reconnections. Then the registrar randomly 
selects and allocates one new neighbor for each of the nodes that 
have sent a rewiring request. 
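The quantities used by Algorithm 2 follow from ranks alone, since $\dim(\Pi_u \cap \Pi_v) = \dim \Pi_u + \dim \Pi_v - \dim(\Pi_u + \Pi_v)$. The sketch below is a rough illustration with assumptions spelled out: vectors are over GF(2) and stored as bitmasks, growth is measured between two snapshots, and the ratio is taken as intersection growth over joint growth, a direction the text does not state explicitly. All names are hypothetical.

```python
def rank_gf2(vectors):
    """Rank over GF(2) of vectors given as integer bitmasks."""
    basis = {}
    for vec in vectors:
        while vec:
            lead = vec.bit_length() - 1
            if lead not in basis:
                basis[lead] = vec
                break
            vec ^= basis[lead]
    return len(basis)


def joint_and_intersection_dims(space_u, space_v):
    """Return (dim(U + V), dim(U intersect V)) for two spanning sets."""
    joint = rank_gf2(list(space_u) + list(space_v))
    inter = rank_gf2(space_u) + rank_gf2(space_v) - joint
    return joint, inter


def wants_rewiring(before, after, threshold):
    """Assumed criterion: intersection growth exceeds threshold * joint growth."""
    j0, i0 = joint_and_intersection_dims(*before)
    j1, i1 = joint_and_intersection_dims(*after)
    return (i1 - i0) > threshold * (j1 - j0)


before = ([0b001], [0b010])
# Both neighbors end up holding the same three-dimensional space.
after_clustered = ([0b001, 0b010, 0b100], [0b010, 0b001, 0b100])
# The neighbors keep receiving independent vectors.
after_diverse = ([0b0001, 0b0100], [0b0010, 0b1000])

print(wants_rewiring(before, after_clustered, threshold=1))
print(wants_rewiring(before, after_diverse, threshold=1))
```

Writing the comparison as a product rather than a quotient avoids a division by zero when the joint space does not grow in the observation window.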

Our last proposed algorithm (Algorithm 3), while still 
peer-initiated and decentralized, relies more than the two previous 
ones on the computational capabilities of the registrar. The 
basic observation is that nodes in the same cluster will not 
only receive overlapping subspaces from their parents, but 
will moreover end up collecting subspaces with very 
small distance (this follows from Theorem 2 and Corollary 1, 
and is also illustrated through simulation results in §VI-D; see 
Figure 8). Each unsatisfied peer $v$ sends a rewiring request 
to the registrar, indicating to the registrar the subspace $\Pi_v$ it 
has collected. A peer can decide it is not satisfied using, for 
example, the same criterion as in Algorithm 2. 

The registrar waits for a short time period, to collect requests 
from a number of dissatisfied nodes. These are the nodes 
of the network that have detected they are inside clusters. It 
then calculates the distance between the identified subspaces 
to decide which peers belong to the same cluster. While 
such exact calculations can be computationally demanding, 
in practice the registrar can use one of the many hashing 
algorithms to do so efficiently. Finally the registrar breaks the 
clusters by rewiring a small number of nodes in each cluster. 
The allocated new neighbors are either nodes that belong to 
different clusters, or nodes that have not sent a rewiring 
request at all. 
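The registrar-side grouping can be sketched as follows. This is a simplified illustration, not the scheme evaluated in the paper: it assumes the distance is $\dim(\Pi_u + \Pi_v) - \dim(\Pi_u \cap \Pi_v)$, works over GF(2) with bitmask vectors, computes all pairwise distances exactly instead of hashing, and merges peers whose distance is below a threshold. The function names and the toy data are hypothetical.

```python
def rank_gf2(vectors):
    """Rank over GF(2) of vectors given as integer bitmasks."""
    basis = {}
    for vec in vectors:
        while vec:
            lead = vec.bit_length() - 1
            if lead not in basis:
                basis[lead] = vec
                break
            vec ^= basis[lead]
    return len(basis)


def subspace_distance(a, b):
    """Assumed distance: dim(A + B) - dim(A intersect B)."""
    return 2 * rank_gf2(list(a) + list(b)) - rank_gf2(a) - rank_gf2(b)


def group_into_clusters(spaces, threshold):
    """Merge peers whose reported subspaces are closer than `threshold`."""
    names = list(spaces)
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if subspace_distance(spaces[a], spaces[b]) < threshold:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), set()).add(n)
    return sorted(groups.values(), key=min)


# Subspaces reported by four dissatisfied peers (toy data).
reported = {
    "a": [0b0001, 0b0010],
    "b": [0b0001, 0b0011],  # same span as "a"
    "c": [0b0100, 0b1000],
    "d": [0b0100, 0b1100],  # same span as "c"
}

print(group_into_clusters(reported, threshold=2))
```

The registrar could then rewire one peer from each returned group to a peer in a different group, or to a peer that sent no request.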

We will compare our proposed algorithms against the 
Random Rewiring currently employed by many peer-to-peer 
protocols (e.g., see [3], [4], [34]). In this algorithm, each 
time a peer receives a packet, with probability $p$ it contacts the 
registrar and asks to change a neighbor. The registrar randomly 
selects which neighbor to change, and randomly allocates a 
new neighbor from the active peer nodes. 






















Fig. 7. A sample of topology with three clusters: cluster 1 contains nodes 
1-10, cluster 2 nodes 11-20 and cluster 3 nodes 21-30. 

D. Simulation Results 

For our simulation results we start from randomly 
generated topologies similar to Figure 7, which consist of 30 
nodes connected into three distinct clusters. The source is 
node 1, and belongs to the first cluster. The bottleneck links 
are indicated with arrows (and thus indicate the underlying 
physical link structure). Our first set of simulation results, 
depicted in Figure 8, shows that the subspaces within each 
cluster are very similar, while the subspaces across clusters 
are significantly different, where we use the distance measure 
$d_S(\cdot, \cdot)$ defined earlier in the paper. These results indicate, for example, 
that knowledge of these subspaces will allow the registrar to 
accurately detect and break clusters (Algorithm 3). 

Our second set of simulation results considers again topolo- 
gies with three clusters: cluster 1 has 15 nodes and contains the 
source, cluster 2 has also 15 nodes, while the number of nodes 
in cluster 3 increases from 15 to 250. During the simulations 
we assume that the registrar keeps the nodes' degree between 
2 and 5, with an average degree of 3.5. All edges correspond 
to unit capacity links. 

We compare the performance of the three proposed algorithms 
in §VI-C with random rewiring. We implemented these 
algorithms as follows. For random rewiring, every time a 
node receives a packet it changes one of its neighbors with 
probability p = gjL. For Algorithm 1, we use a parameter 
of $k = 10$, and check whether the non-innovative packets 
received exceed this value every four received packets. For 
Algorithm 2, every node checks its received subspaces every 
four received packets using the threshold value $T = 1$. Finally, 
for Algorithm 3, we assume that nodes use the same criterion 
as in Algorithm 2 to decide whether they form part of a cluster, 
again with $T = 1$. Dissatisfied nodes send their observed 
subspaces to the registrar. The registrar assigns nodes $u$ and $v$ 
to the same cluster if $d_S(\Pi_u, \Pi_v) < 7$. 

Table I compares all algorithms with respect to the average 
collection time, defined as the difference between the time a 
peer receives the first packet and the time it can decode all 
packets, averaged over all peers. All algorithms perform 
similarly, indicating that all of them succeed in breaking the 
clusters. It is important to note that the average collection 
time is measured in number of exchanges needed and does not 
account for the delays incurred due to rewiring. We compare 
the number of such rewirings needed next. 

Figure 9 plots the average number of rewirings each algorithm 
employs. Random rewiring incurs a number of rewirings 
proportional to the number of P2P nodes, independently 
of the underlying network topology. Our proposed algorithms, 
on the other hand, adapt to the existence and size 
of clusters. Algorithm 3 leads to the smallest number of 
rewirings. Algorithm 2 leads to a larger number of rewirings, 
partly because the new neighbors are chosen randomly 
and not in a manner that necessarily breaks the clusters. 
The behavior of Algorithm 1 is interesting. This algorithm 
rewires any node that has received more than $k$ non-innovative 
packets. Consider cluster 3, whose size we increase for the 
simulations. If $k$ is small with respect to the cluster size, then 
a large number of nodes will collect close to $k$ non-innovative 
packets; thus a large number of nodes will ask for rewirings. 
Moreover, even after rewirings that break the cluster occur, 
some nodes will still collect linearly dependent information 
and ask for additional rewirings. As cluster 3 increases in size, 
the information disseminates more slowly within the cluster. 
Nodes at the border, close to the bottleneck links, will now be 
the first to ask for rewirings, long before other nodes in 
the network collect a large number of non-innovative packets. 
Thus once the clusters are broken, no new rewirings will be 
requested. This desirable behavior of Algorithm 1 manifests 
itself for large clusters; for small clusters, such as cluster 2, the 
second algorithm, for example, achieves a better performance 
using fewer reconnections. 



TABLE I 
Average Collection Time 





[Plot for Fig. 9: average number of rewirings (vertical axis, 0 to 400) versus total number of P2P nodes (horizontal axis, 100 to 550); curves: Random, Algo1, Algo2, Algo3.] 

Fig. 9. Average number of rewirings, for a topology with three clusters: 
cluster 1 has 15 nodes, cluster 2 has 15 nodes, while the number of nodes 
in cluster 3 increases from 20 to 250 as described in Table I. 

VII. Conclusions and Discussion 

In this work we explored the properties of subspaces each 
node collects in networks that employ randomized network 
coding and found that there exists an intricate relationship 
between the structure of the network and these properties. 
This observation led us to utilize these relationships in several 
different applications. As the first application, we studied the 
conditions under which we can passively infer the network 
topology during content distribution. We showed that these 
conditions are not very restrictive and hold for a general 
class of information dissemination protocols. As our second 
application, we focused on locating Byzantine attackers in the 
network. We studied and formulated this problem and found 
that for the single adversary we can identify the adversary 
within an uncertainty of two nodes. For the case of mul- 
tiple adversaries, we discussed a number of algorithms and 
conditions under which we can guarantee identifiability. For 
our last application, we investigated the relation between the 
bottlenecks in a logical network and the subspaces received 
at a specific network node. We leveraged our observations to 
propose decentralized peer-initiated algorithms for rewiring in 
P2P systems to avoid clustering in a cost-efficient manner, and 
evaluated our algorithms through simulation results. 

The applications studied in this paper demonstrate ad- 
vantages of using randomized network coding for network 
management and control, that are additional to throughput 
benefits. These are just a few examples and we believe that 
there exist a lot more applications where we can use the 
subspace properties developed in this work. We hope that these 
properties will become part of a toolbox that can be used to 
develop applications for systems that employ network coding 
techniques. 
Appendix A 
Proofs 

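Proof of Lemma 1: First, let us fix a basis for $\Pi_S$. Then
choosing $m$ vectors uniformly at random from $\Pi_S$ is equivalent
to choosing an $m \times n$ matrix $A$ uniformly at random from
$\mathbb{F}_q^{m \times n}$ and constructing $\Pi = \langle A \rangle$
with respect to this fixed basis.

It is well known (e.g., see [35]) that the number of different
$m \times n$ matrices $A$ with rank $k \leq \min[m, n]$ over
$\mathbb{F}_q$ is equal to

$$N_{m,n}(k) \triangleq q^{(m+n-k)k} \prod_{i=0}^{k-1} \frac{(1 - q^{i-n})(1 - q^{i-m})}{1 - q^{i-k}}.$$

So we can write

$$\Pr[\dim(\Pi) = k] = \frac{N_{m,n}(k)}{q^{mn}} = q^{-(m-k)(n-k)} \prod_{i=0}^{k-1} \frac{(1 - q^{i-n})(1 - q^{i-m})}{1 - q^{i-k}}.$$

Then using the Taylor series $\frac{1}{1-\epsilon} = 1 + \epsilon + \epsilon^2 + \cdots$
for $|\epsilon| < 1$, choosing $\epsilon = q^{-1}$, we can write

$$\Pr[\dim(\Pi) = k] = q^{-(m-k)(n-k)} \left[ 1 - O(q^{-1}) \right].$$

By setting $k = \min[m, n]$ we are done. ■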
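The counting argument in this proof can be checked numerically for small parameters. The sketch below is only an illustration: the function names are hypothetical, and the product formula used for the count is the standard closed form for the number of rank-$k$ matrices over a finite field, rewritten in the factored form that appears in the proof.

```python
from fractions import Fraction


def num_rank_k_matrices(m: int, n: int, k: int, q: int) -> int:
    """Count m x n matrices of rank exactly k over F_q (standard closed form)."""
    count = Fraction(1)
    for i in range(k):
        count *= Fraction((q**m - q**i) * (q**n - q**i), q**k - q**i)
    return int(count)  # the full product is always an integer


def prob_rank_direct(m: int, n: int, k: int, q: int) -> Fraction:
    """Pr[dim = k] computed as N_{m,n}(k) / q^(mn)."""
    return Fraction(num_rank_k_matrices(m, n, k, q), q ** (m * n))


def prob_rank_factored(m: int, n: int, k: int, q: int) -> Fraction:
    """Pr[dim = k] in the factored form q^(-(m-k)(n-k)) * prod(...)."""
    value = Fraction(q) ** (-(m - k) * (n - k))
    for i in range(k):
        value *= (1 - Fraction(q) ** (i - n)) * (1 - Fraction(q) ** (i - m))
        value /= 1 - Fraction(q) ** (i - k)
    return value


if __name__ == "__main__":
    # Full-rank probability for 3 random vectors in a 5-dimensional space,
    # for a small and a larger field size (illustrative parameters).
    for q in (2, 256):
        print(q, float(prob_rank_direct(3, 5, 3, q)))
```

Running the script for increasing $q$ shows the full-rank probability approaching one, in line with the $1 - O(q^{-1})$ behaviour stated in the lemma.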
Proof of Lemma 2: The probability that all $m$ vectors
are in the intersection is

$$\left( \frac{q^{d_{12}}}{q^{d_1}} \right)^m = q^{(d_{12} - d_1) m},$$

which is of order $O(q^{-m})$ provided that $\Pi_1 \not\subseteq \Pi_2$,
i.e., $d_{12} < d_1$. ■
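For a concrete instance of this probability, the following minimal sketch enumerates every possible draw over $\mathbb{F}_2$. The bitmask encoding of vectors and the helper names `span_gf2` and `prob_all_in_intersection` are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import product


def span_gf2(generators):
    """All vectors (encoded as bitmasks) in the F_2-span of the generators."""
    vectors = {0}
    for g in generators:
        vectors |= {v ^ g for v in vectors}
    return vectors


def prob_all_in_intersection(pi1, pi2, m):
    """Exact probability that m uniform draws from pi1 all lie in pi2."""
    draws = product(sorted(pi1), repeat=m)
    hits = sum(all(v in pi2 for v in draw) for draw in draws)
    return Fraction(hits, len(pi1) ** m)


if __name__ == "__main__":
    pi1 = span_gf2([0b001, 0b010])  # d_1 = 2
    pi2 = span_gf2([0b010, 0b100])  # d_2 = 2, d_12 = 1
    for m in (1, 2, 3):
        print(m, prob_all_in_intersection(pi1, pi2, m))
```

With $q = 2$, $d_1 = 2$ and $d_{12} = 1$, the enumerated probability should coincide with $q^{(d_{12} - d_1)m} = 2^{-m}$.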
Proof of Lemma 3: Let $v_1, \ldots, v_m$ be the vectors chosen
randomly from $\Pi_S$ to construct $\Pi$, namely, we have
$\Pi = \langle v_1, \ldots, v_m \rangle$. Then construct the sequence of
subspaces $\Pi(i)$, $i = 0, \ldots, m$, as follows. First, set
$\Pi(0) = \Pi_k$ and then define $\Pi(i)$ for $i \geq 1$ recursively,
$\Pi(i) = \Pi(i-1) + \langle v_i \rangle$. We also define
$d(i) = \dim(\Pi(i))$, $i = 0, \ldots, m$. From Lemma 2, by
choosing $\Pi_1 = \Pi_S$, $\Pi_2 = \Pi(i-1)$ and $m = 1$, we deduce
that $d(i) = d(i-1) + 1$ with probability $1 - O(q^{-1})$, unless
$d(i-1) = n$.

Now we consider two cases. First, if $m + k \leq n$ then we
have $\dim(\Pi + \Pi_k) = k + m$, or equivalently $\dim(\Pi \cap \Pi_k) = 0$,
with high probability, i.e., $1 - O(q^{-1})$. Secondly, when
$m + k > n$ we have $\dim(\Pi + \Pi_k) = n$ with probability $1 - O(q^{-1})$.
From Lemma 1 we have $\dim(\Pi) = \min[m, n]$ w.h.p. So we
have $\dim(\Pi \cap \Pi_k) = \dim(\Pi_k) + \dim(\Pi) - \dim(\Pi_k + \Pi) = k + \min[m, n] - n$.

Combining these two cases we can write

$$\dim(\Pi \cap \Pi_k) = (k + \min[m, n] - n)^+,$$

w.h.p., which completes the proof. ■
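The dimension formula of Lemma 3 can be explored empirically. The sketch below is a hedged illustration rather than part of the proof: it assumes a prime field $\mathbb{F}_p$ with $p = 2^{31} - 1$, takes $\Pi_k$ to be the span of the first $k$ unit vectors without loss of generality, and computes the intersection dimension through $\dim(\Pi \cap \Pi_k) = \dim(\Pi) + \dim(\Pi_k) - \dim(\Pi + \Pi_k)$. All function names are hypothetical.

```python
import random

P = 2_147_483_647  # the prime 2^31 - 1, used as the field size q (assumption)


def rank_mod_p(rows, p=P):
    """Rank of a matrix over F_p via Gaussian elimination."""
    rows = [list(r) for r in rows]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col] % p:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank


def intersection_dim(n, m, k, rng):
    """dim(Pi ∩ Pi_k) for m random vectors in F_p^n and a fixed k-dim subspace."""
    basis_k = [[1 if j == i else 0 for j in range(n)] for i in range(k)]
    sample = [[rng.randrange(P) for _ in range(n)] for _ in range(m)]
    return rank_mod_p(basis_k) + rank_mod_p(sample) - rank_mod_p(basis_k + sample)


def predicted_dim(n, m, k):
    """The value (k + min[m, n] - n)^+ stated in Lemma 3."""
    return max(k + min(m, n) - n, 0)


if __name__ == "__main__":
    rng = random.Random(0)
    for n, m, k in [(6, 2, 3), (6, 4, 3), (5, 8, 4)]:
        print((n, m, k), intersection_dim(n, m, k, rng), predicted_dim(n, m, k))
```

Because the field is large, a mismatch between the sampled and predicted dimensions has probability of order $1/p$ per trial, which is what "with high probability" means in the lemma.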
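Proof of Corollary: Let us define $\Pi_{12} = \Pi_1 \cap \Pi_2$, where
$d_{12} = \dim(\Pi_{12})$, and let $\hat{\Pi}_i$ denote the subspace
spanned by the $m_i$ vectors drawn from $\Pi_i$. Using Lemma 3 and
taking $\Pi_S = \Pi_1$ and $\Pi_k = \Pi_{12}$, we have

$$\dim(\hat{\Pi}_1 \cap \Pi_{12}) = \min\left[ d_{12}, (m_1 - (d_1 - d_{12}))^+ \right],$$

with probability $1 - O(q^{-1})$. Now, we can write

$$\begin{aligned}
P[\hat{d}_{12} = \alpha] ={}& P\left[ \hat{d}_{12} = \alpha \mid \dim(\hat{\Pi}_1 \cap \Pi_{12}) = \beta \right] P\left[ \dim(\hat{\Pi}_1 \cap \Pi_{12}) = \beta \right] \\
&+ P\left[ \hat{d}_{12} = \alpha \mid \dim(\hat{\Pi}_1 \cap \Pi_{12}) \neq \beta \right] P\left[ \dim(\hat{\Pi}_1 \cap \Pi_{12}) \neq \beta \right],
\end{aligned}$$

where $\hat{d}_{12} = \dim(\hat{\Pi}_1 \cap \hat{\Pi}_2)$. Substituting
$\beta = \min[d_{12}, (m_1 - (d_1 - d_{12}))^+]$ we obtain

$$P[\hat{d}_{12} = \alpha] = P\left[ \hat{d}_{12} = \alpha \mid \dim(\hat{\Pi}_1 \cap \Pi_{12}) = \beta \right] \left( 1 - O(q^{-1}) \right) + O(q^{-1}).$$

Selecting $\alpha$ properly and using Lemma 3 one more time, we
get

$$P[\hat{d}_{12} = \alpha] = 1 - O(q^{-1}),$$

where $\alpha = \min[\beta, (m_2 - (d_2 - \beta))^+]$, which completes the
proof. ■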
Proof of Theorem 1: To prove the theorem, it is sufficient
to show that the general position condition holds for one
specific $i$ with high probability. This is sufficient because if
$p_i$ is the probability that $\Pi$ is in general position with
respect to $\Pi_i$, $i = 1, \ldots, r$, then the probability that
$\Pi$ is in general position with respect to the whole family is
lower bounded by $1 - \sum_{i=1}^{r} (1 - p_i)$.

Now by applying Lemma 3 we know that $p_i = 1 - O(q^{-1})$,
which completes the proof. ■
Proof of Lemma 4: Here we assume that $n$ is very large.
Then in Corollary 3 we will derive a sufficient condition on
how large $n$ must be.

Let $v$ be the node that has the longest path to the source $S$.
Because of Definition 1 we can write $T_s \leq \tau_v - 1$. Then we
may upper bound $\tau_v$ as follows

$$\tau_v \leq 2 + \max_{u \in P(v)} \tau_u,$$

where $P(v)$ is the set of parents of $v$. Now we can repeat the
above argument until we reach the source $S$. So finally we
have

$$\tau_v \leq 2 D(G),$$

which leads to the lemma's assertion. ■
Proof of Lemma 5: Let us write

$$\begin{aligned}
\dim(\pi_u(1) \cap \Pi_v(j)) &\overset{(a)}{=} \dim\left( \pi_u(1) \cap (\Pi_v(j) \cap \Pi(0)) \right) \\
&\overset{(b)}{=} \dim(\pi_u(1) \cap \pi_v(1)) \\
&\overset{(c)}{=} \min\left[ d_0, (k_u(1) + k_v(1) - d_0)^+, k_u(1), k_v(1) \right] \\
&= (k_u(1) + k_v(1) - d_0)^+ \\
&< k_u(1),
\end{aligned}$$

where (a) follows because $\pi_u(1) \subseteq \Pi(0)$ and (c) is a result
of Corollary 2. So $\forall j \in \{1, \ldots, t\}$ we have
$\pi_u(1) \not\subseteq \Pi_v(j)$, which results in
$\Pi_u(i) \not\subseteq \Pi_v(j)$, $\forall i, j \in \{1, \ldots, t\}$. By
symmetry, we have the second assertion of the lemma, namely,
$\Pi_v(i) \not\subseteq \Pi_u(j)$, $\forall i, j \in \{1, \ldots, t\}$.

Now, it only remains to check (b). We will prove this by
induction. Obviously, $\Pi(0) \cap \Pi_v(1) = \pi_v(1)$. Suppose that we
have $\Pi(0) \cap \Pi_v(k) = \pi_v(1)$ where $k < t$; then we show that
it also holds for $k + 1$.

We know that $\pi_v(1) \subseteq \Pi(0) \cap \Pi_v(k+1)$. To show that
$\Pi(0) \cap \Pi_v(k+1) \subseteq \pi_v(1)$ we proceed as follows. Let
$w \in \Pi(0) \cap \Pi_v(k+1)$; then $w \in \Pi(0)$ and
$w \in \Pi_v(k+1) = \sum_{i=1}^{k+1} \pi_v(i)$. We may decompose $w$ as
$w = \sum_{i=1}^{k+1} w_i$ where $w_i \in \pi_v(i)$. Then we notice that
$w_{k+1} = w - \sum_{i=1}^{k} w_i \in \Pi(k-1)$ and
$\Pi(k-1) \cap \pi_v(k+1) = \{0\}$ w.h.p. (by Lemma 3).
So we conclude that $w_{k+1} = 0$, which means $w \in \Pi_v(k)$. This
shows that $w \in \Pi(0) \cap \Pi_v(k)$, where by the induction assumption
we have $w \in \pi_v(1)$, and we are done. ■
Proof of Corollary: Because we have $\Pi_u(0) \not\subseteq \Pi_v(j)$,
then by Lemma 2 we have $\pi_a(1) \not\subseteq \Pi_v(j)$ w.h.p. So as a
result we have $\Pi_a(i) \not\subseteq \Pi_v(j-1)$, $\forall i, j \in \{1, \ldots, t\}$.
Because $\Pi_b(j) \subseteq \Pi_v(j-1)$ we conclude that
$\Pi_a(i) \not\subseteq \Pi_b(j)$, $\forall i, j \in \{1, \ldots, t\}$, w.h.p.
By symmetry, we also deduce the other part of the corollary. ■

Appendix B 

Algebraic Model for Synchronous Networks 

In this appendix we employ an algebraic approach to 
analyze the dissemination protocol given in Algorithm III. II 
This approach is similar to lfT31 and h but differs in that we 
introduce memory into the coding process. 

We introduce memory as follows. Suppose we are interested
in finding the transfer function between the source and an
arbitrary node $v$. Let $X$ be an $n \times \ell$ matrix whose rows are
the $n$ packets (vectors) that the source wants to transmit to the
receivers. We assume that $\dim(\langle X \rangle) = n$. Let
$Y(t) \in \mathbb{F}_q^{\xi \times \ell}$ be a matrix whose rows are the
packets that pass through the $\xi$ edges of the network at time $t$.
Let $Z_v(t)$ be the set of packets that node $v$ receives. Similarly
to [13], we will write state-space equations that involve these
vectors; however, we will ensure that, at each time $t$, coding at
each node occurs across all the packets that the node has
received before time $t$.

In each timeslot $t$, the source injects $|\mathrm{Out}(S)|$ packets
into the network that are random linear combinations of the
original source packets $X$. These linear combinations can be
captured as $M(t)X$, where $M(t) \in \mathbb{F}_q^{|\mathrm{Out}(S)| \times n}$
is a random matrix. Intermediate network nodes will transmit
packets on their outgoing edges depending on the network
connectivity and the state of the dissemination protocol.

The network connectivity can be captured by the $\xi \times \xi$
adjacency matrix $T$ of the labeled line graph of the graph
$G$, defined as follows

$$T_{ij} \triangleq \begin{cases} 1 & \mathrm{head}(e_i) = \mathrm{tail}(e_j), \\ 0 & \text{otherwise.} \end{cases}$$
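As a small illustration of this definition, the sketch below builds $T$ for a toy graph. The edge-list representation (each edge as a `(tail, head)` pair) and the example topology are assumptions chosen for illustration; they do not come from the paper.

```python
def line_graph_adjacency(edges):
    """Return T with T[i][j] = 1 iff head(e_i) == tail(e_j), else 0.

    `edges` is a list of (tail, head) pairs; the list order fixes the
    edge labels e_0, e_1, ...
    """
    return [
        [1 if head_i == tail_j else 0 for (tail_j, _head_j) in edges]
        for (_tail_i, head_i) in edges
    ]


if __name__ == "__main__":
    # Hypothetical diamond network: S -> a -> v and S -> b -> v.
    edges = [("S", "a"), ("S", "b"), ("a", "v"), ("b", "v")]
    for row in line_graph_adjacency(edges):
        print(row)
```

In this toy network only the pairs $(e_0, e_2)$ and $(e_1, e_3)$ are adjacent in the line graph, since those are the only places where one edge ends at the node from which the other starts.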



To model random coding over a field $\mathbb{F}_q$, we consider a
sequence of random matrices $F_1^{(t)}, \ldots, F_{t-1}^{(t)}$ which
conform to $T$. That is, the entries of these matrices have, for
$i \neq j$, $(F_k^{(t)})_{ij} = 0$ wherever $T_{ij} = 0$ and have random
numbers from $\mathbb{F}_q$ in all other places.

The dissemination protocol dictates when a node can start
transmitting packets, according to its waiting time (equivalently,
when the outgoing edges of the node will have packets
sent through them). To capture this, we will use the step
function $u(t)$,

$$u(t) \triangleq \begin{cases} 1 & t \geq 0, \\ 0 & \text{otherwise,} \end{cases}$$

and define the $\xi \times \xi$ diagonal matrix $U(t)$ as

$$\forall i \in E : \quad U_{ii}(t) = u\left( t - \tau_{\mathrm{tail}(i)} - 1 \right),$$

where $\tau_v$ is the waiting time for node $v$. In this section we
assume that the waiting times may have arbitrary values and
we do not restrict them according to Definition 1.
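The activation matrix $U(t)$ is easy to tabulate for a given set of waiting times. The sketch below assumes an edge list of `(tail, head)` pairs and a dictionary of waiting times keyed by node; both the representation and the example values are hypothetical.

```python
def step(t: int) -> int:
    """Step function u(t): 1 for t >= 0, 0 otherwise."""
    return 1 if t >= 0 else 0


def activation_matrix(edges, waiting_time, t):
    """Diagonal matrix U(t) with U_ii(t) = u(t - tau_{tail(e_i)} - 1)."""
    size = len(edges)
    U = [[0] * size for _ in range(size)]
    for i, (tail, _head) in enumerate(edges):
        U[i][i] = step(t - waiting_time[tail] - 1)
    return U


if __name__ == "__main__":
    # Hypothetical diamond network and waiting times (illustration only).
    edges = [("S", "a"), ("S", "b"), ("a", "v"), ("b", "v")]
    waiting_time = {"S": 0, "a": 1, "b": 2}
    for t in (1, 2, 3):
        diagonal = [activation_matrix(edges, waiting_time, t)[i][i] for i in range(len(edges))]
        print(t, diagonal)
```

An edge becomes active one timeslot after the waiting time of its tail node has elapsed, so edges leaving nodes with larger waiting times switch on later.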

Using the above definitions, the set of packets (vectors) that
each node $v$ receives in every time instant $t > 0$ can be written
as follows

$$\begin{aligned}
Y(t) &= U(t) \left[ A M(t) X + \sum_{k=1}^{t-1} F_k^{(t)} Y(k) \right], \\
Z_v(t) &= B_v Y(t),
\end{aligned} \tag{11}$$

where $Y(0) = 0$. In the above, $A \in \mathbb{F}_q^{\xi \times |\mathrm{Out}(S)|}$
is a matrix which represents the connection of node $S$ to the rest
of the network. In the same way, matrix
$B_v \in \mathbb{F}_q^{|\mathrm{In}(v)| \times \xi}$ defines the
connection of node $v$ to the set of edges in the network.

It is worth noting that although (11) is written for the
packets transmitted on each edge, we can write the same set
of equations for the coding vectors.

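Suppose we are interested in finding the output of such
a system at some time instant $T$. We can rewrite the above
equations by defining new matrices as follows. We can collect
the source random operations as

$$M_T \triangleq \begin{bmatrix} M(1) \\ \vdots \\ M(T) \end{bmatrix} \in \mathbb{F}_q^{T |\mathrm{Out}(S)| \times n}.$$

For the states of the system we define

$$Y_T \triangleq \begin{bmatrix} Y(1) \\ \vdots \\ Y(T) \end{bmatrix} \in \mathbb{F}_q^{\xi T \times \ell}.$$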

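We also define a new set of matrices which represent the
input-output relation. Using matrix $A$ we define the following matrix

$$A_T \triangleq I_T \otimes A \in \mathbb{F}_q^{\xi T \times T |\mathrm{Out}(S)|}.$$

For the connection of node $v$ we define

$$B_v(T) \triangleq \begin{bmatrix} 0_{|\mathrm{In}(v)| \times (T-1)\xi} & B_v \end{bmatrix} \in \mathbb{F}_q^{|\mathrm{In}(v)| \times \xi T}.$$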



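We define the matrix $F_T$, which represents how the states are
related to each other:

$$F_T = \begin{bmatrix}
0 & 0 & \cdots & 0 & 0 \\
F_1^{(2)} & 0 & \cdots & 0 & 0 \\
F_1^{(3)} & F_2^{(3)} & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
F_1^{(T)} & F_2^{(T)} & \cdots & F_{T-1}^{(T)} & 0
\end{bmatrix}.$$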

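Finally, we use the matrix $U_T$ that captures the time when
transmissions start for each edge:

$$U_T = \begin{bmatrix} U(1) & & \\ & \ddots & \\ & & U(T) \end{bmatrix} \in \mathbb{F}_q^{\xi T \times \xi T}.$$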



Using the above definitions, we can rewrite (11) as follows

$$\begin{aligned}
Y_T &= U_T \left( A_T M_T X + F_T Y_T \right), \\
Z_v(T) &= B_v(T) Y_T.
\end{aligned}$$

This equation can be solved to find the input-output transfer
matrix at time $T$, which results in

$$Z_v(T) = \underbrace{\left[ B_v(T) (I - U_T F_T)^{-1} U_T A_T M_T \right]}_{H_{Sv}(T)} X, \tag{12}$$

where $H_{Sv}(T) \in \mathbb{F}_q^{|\mathrm{In}(v)| \times n}$. From the
definition of matrix $F_T$, we know that it is a strictly lower
triangular matrix, which means $F_T$ is nilpotent and we have
$F_T^T = 0$ (here the superscript $T$ denotes the $T$-th power).
The same applies for the matrix $U_T F_T$, namely we have
$(U_T F_T)^T = 0$. So the matrix $I - U_T F_T$ has an inverse,
which is equal to

$$(I - U_T F_T)^{-1} = I + U_T F_T + \cdots + (U_T F_T)^{T-1}.$$
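The finite series for the inverse can be verified numerically. The sketch below is an illustration under simplifying assumptions: a scalar strictly lower triangular matrix $N$ over a small prime field stands in for the block matrix $U_T F_T$, and the series is summed up to the power $\text{size} - 1$, which is enough because $N^{\text{size}} = 0$. The field size and helper names are hypothetical.

```python
import random

P = 257  # a small prime field size, chosen only for illustration


def identity(size):
    """size x size identity matrix."""
    return [[1 if i == j else 0 for j in range(size)] for i in range(size)]


def matmul(A, B, p=P):
    """Matrix product over F_p."""
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)] for row in A]


def neumann_inverse(N, p=P):
    """Return I + N + ... + N^(size-1) over F_p for a nilpotent matrix N."""
    size = len(N)
    total = identity(size)
    power = identity(size)
    for _ in range(size - 1):
        power = matmul(power, N, p)
        total = [[(x + y) % p for x, y in zip(r1, r2)] for r1, r2 in zip(total, power)]
    return total


if __name__ == "__main__":
    rng = random.Random(1)
    size = 5
    # Random strictly lower triangular matrix (zero on and above the diagonal).
    N = [[rng.randrange(P) if j < i else 0 for j in range(size)] for i in range(size)]
    I = identity(size)
    I_minus_N = [[(I[i][j] - N[i][j]) % P for j in range(size)] for i in range(size)]
    print(matmul(I_minus_N, neumann_inverse(N)) == I)
```

The product telescopes to $I - N^{\text{size}}$, so the check succeeds for any strictly lower triangular $N$, not just for the random instance drawn here.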

Finally, note that if the nodes do not wait before starting the
transmission ($\tau_v = 0$, $\forall v \in V$), then we will have $U_T = I$.

A. Proof of Theorem 2

For simplicity, in the following proof, we assume that each 
edge of the network has capacity 1. Edges with capacity more 
than 1 can be modeled by replacing them with multiple edges 
of unit capacity. 

From (12) the transfer matrix from $S$ to $v$ at time $T$ is
equal to $H_{Sv}(T)$. Knowing that the min-cut of node $v$ is $c_v$,
we choose a set of $c_v$ incoming edges to $v$ such that there exist
$c_v$ edge-disjoint paths from $S$ to $v$ and find the input-output
transfer matrix just for this set of edges. Then we can write

$$\begin{aligned}
H_{Sv}(T) &= B_v(T) (I - U_T F_T)^{-1} U_T A_T M_T \\
&= B_v(T) \left( I + \cdots + (U_T F_T)^{T-1} \right) U_T A_T M_T,
\end{aligned} \tag{13}$$

where $H_{Sv}(T) \in \mathbb{F}_q^{c_v \times n}$ and
$B_v(T) \in \mathbb{F}_q^{c_v \times \xi T}$. Let $f_{ij}^{(t,k)}$
denote the entries of $F_k^{(t)}$ and $m_{ij}^{(t)}$ denote the entries
of $M(t)$. Every node in the network performs random linear
network coding, so $m_{ij}^{(t)}$ and $f_{ij}^{(t,k)}$ (those that are
not zero) are chosen uniformly at random from $\mathbb{F}_q$.
From (13) we know that each entry of $H_{Sv}(T)$ is a
polynomial of degree at most $T$ in variables $m_{ij}^{(t)}$ and
$f_{ij}^{(t,k)}$. For $T > t_0(v)$, where $t_0(v) = \max \ldots$, we know that
there exists a trivial solution for variables $m_{ij}^{(t)}$ and
$f_{ij}^{(t,k)}$ (which simply routes $c_v$ packets from $S$ to $v$
through the $c_v$ edge-disjoint paths) that results in

$$H_{Sv}(T) = \begin{bmatrix} I_{c_v} & 0_{c_v \times (n - c_v)} \end{bmatrix}. \tag{14}$$



Note that by changing the routing solution (in fact by changing
the variables $m_{ij}^{(t)}$ properly) we could change the place
of the identity matrix in (14) arbitrarily. We conclude that the
determinant of every $c_v \times c_v$ submatrix of $H_{Sv}(T)$ (which
is a polynomial of degree at most $c_v T$ in variables $m_{ij}^{(t)}$ and
$f_{ij}^{(t,k)}$) is not identically zero. So by using the Schwartz-Zippel
lemma [36] we can upper bound the probability that $H_{Sv}(T)$
is not full rank, if the variables $m_{ij}^{(t)}$ and $f_{ij}^{(t,k)}$ are
chosen uniformly at random, as follows

$$P\left[ \operatorname{rank} H_{Sv}(T) < c_v \right] \leq \frac{c_v T}{q}.$$



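We can apply the same argument for $k \leq \ldots$ consecutive
timeslots to show that

$$P\left[ \operatorname{rank} H_{Sv}(T : T+k-1) < k c_v \right] \leq \frac{k c_v (T + k)}{q},$$

where

$$H_{Sv}(T : T+k-1) \triangleq \begin{bmatrix} H_{Sv}(T) \\ \vdots \\ H_{Sv}(T+k-1) \end{bmatrix}.$$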
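Now let us define the event $A_k(v)$ as follows

$$A_k(v) : \operatorname{rank} H_{Sv}(T : T+k-1) = k c_v.$$

Then we can write

$$\begin{aligned}
P\left[ \bigcap_{v \in V} A_k(v) \right] &= 1 - P\left[ \bigcup_{v \in V} A_k^c(v) \right] \\
&\geq 1 - \sum_{v \in V} P\left[ A_k^c(v) \right] \\
&\geq 1 - \frac{k (T + k)}{q} \sum_{v \in V} c_v,
\end{aligned}$$

where $T > t_0$ and $t_0 = \max_{v \in V} t_0(v)$.

This means that, assuming $q$ is large enough, with high
probability each node $v$ receives $c_v$ innovative packets per
time slot for $t > t_0$.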
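To get a feel for how large $q$ must be, the final bound can be evaluated directly. The sketch below is a minimal illustration: the min-cut values, $T$, $k$ and the target probability are made-up inputs, the function names are hypothetical, and restricting the search to powers of two is an assumption motivated by binary extension fields.

```python
from fractions import Fraction


def success_lower_bound(min_cuts, T, k, q):
    """Evaluate 1 - k (T + k) / q * sum_v c_v, the bound derived above."""
    return 1 - Fraction(k * (T + k), q) * sum(min_cuts)


def smallest_power_of_two_field(min_cuts, T, k, target):
    """Smallest q = 2^s for which the lower bound reaches the target."""
    q = 2
    while success_lower_bound(min_cuts, T, k, q) < target:
        q *= 2
    return q


if __name__ == "__main__":
    min_cuts = [2, 2, 1]  # hypothetical min-cut values c_v
    T, k = 10, 2          # hypothetical time parameters
    print(success_lower_bound(min_cuts, T, k, 2 ** 16))
    print(smallest_power_of_two_field(min_cuts, T, k, Fraction(99, 100)))
```

Since the bound comes from a union bound over the Schwartz-Zippel estimate, it is conservative; the field size it suggests is sufficient for the guarantee but not necessarily required in practice.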



